Query performance

I am currently trying to understand and improve my query performance. The
queries do a full text search, perform a sort, and utilize a terms filter.
My understanding is the sort will use the field data cache and the filter
uses the filter cache. Sometimes searches return in a few hundred
milliseconds; other times they take multiple seconds. The cluster runs on 4
nodes and continuously indexes new documents.

What is the best way to understand why some queries take significantly
longer? My guess is that it has to do with the state of the caches on the
particular node(s) that the query executes on, but is there a way to verify
this? It seems I can get an explain plan of the query, but not a breakdown
of the time spent in the filter, query, and sort operations. Is there a way
to get one?
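
In the meantime, one thing I can do is run the same search repeatedly and
compare the "took" value (server-side time in milliseconds) reported in each
search response. A minimal sketch, assuming the responses have already been
collected as parsed JSON; the helper name summarize_took and the sample
numbers are made up for illustration:

```python
import statistics


def summarize_took(responses):
    """Summarize the server-side `took` times (ms) from search responses."""
    times = sorted(r["took"] for r in responses)
    return {
        "count": len(times),
        "min_ms": times[0],
        "median_ms": statistics.median(times),
        "max_ms": times[-1],
    }


# Made-up sample values, only to show the shape of the input and output.
sample = [{"took": 180}, {"took": 220}, {"took": 2400}, {"took": 310}]
print(summarize_took(sample))
```

This only shows how large the spread is, not where the time goes, which is
why I am asking about a per-operation breakdown.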

I was thinking of trying to change the terms filter execution option to
fielddata, but the filter is on a field with fairly long string terms. My
understanding is that field data usage is mostly proportional to the number
of unique values of the field in the index rather than the number of
documents. Is this correct?
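
For context, this is roughly the shape of the search body I have in mind,
with the execution option made switchable. It is only a sketch: the function
name and all field names are placeholders rather than my real mapping, and it
assumes the filtered query / terms filter syntax as I understand it from the
docs:

```python
import json


def build_search_body(text, filter_field, filter_values, sort_field,
                      execution="plain"):
    """Build a full-text search with a terms filter and a sort.

    `execution` is passed through to the terms filter, e.g. "plain"
    or "fielddata". All field names are hypothetical.
    """
    return {
        "query": {
            "filtered": {
                "query": {"match": {"_all": text}},
                "filter": {
                    "terms": {
                        filter_field: filter_values,
                        "execution": execution,
                    }
                },
            }
        },
        "sort": [{sort_field: {"order": "desc"}}],
    }


body = build_search_body(
    "some search text", "owner_key", ["key-a", "key-b"], "created_at",
    execution="fielddata",
)
print(json.dumps(body, indent=2))
```

Keeping the execution mode as a parameter would let me time the same search
with both settings.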

As documents are continuously being indexed, would a warmer that loads the
sort field into the field data cache also make sense? I was also reading the
section on field data loading in the documentation
(http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/fielddata-formats.html#_fielddata_loading),
but I didn't quite understand it. Is "category" an example of a field in the
index? Overall load on the cluster is low, but I am wondering what the impact
is of indexing and querying at the same time.
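
To make the two options concrete, here is how I currently read them, as a
rough sketch. The helper names, the index/type names in the comments, and the
field name created_at are all placeholders, and I am assuming that "category"
in the docs is just a sample field name that I would replace with my own sort
field:

```python
import json


def warmer_body(sort_field):
    """Search to register as a warmer: sorting on the field should
    load its field data whenever a new segment is warmed."""
    return {
        "query": {"match_all": {}},
        "sort": [sort_field],
    }


def eager_fielddata_mapping(doc_type, field):
    """Mapping fragment that asks for the field's field data to be
    loaded eagerly instead of on first use."""
    return {
        doc_type: {
            "properties": {
                field: {
                    "type": "string",
                    "fielddata": {"loading": "eager"},
                }
            }
        }
    }


# Assumed endpoints (placeholder index/type names):
#   PUT /my_index/_warmer/sort_warmer   <- warmer_body(...)
#   PUT /my_index/_mapping/my_type      <- eager_fielddata_mapping(...)
print(json.dumps(warmer_body("created_at"), indent=2))
print(json.dumps(eager_fielddata_mapping("my_type", "created_at"), indent=2))
```

If that reading is right, I would only need one of the two for the sort
field, but I am not sure which is preferable while indexing continuously.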
