The fetch phase returns additional information for each hit (even if you don't ask for _source), for example the actual _id of the document (the query phase only streams the internal doc ids, the "Lucene ids"). It's strange that this is what actually takes most of the time, to be honest. Can you gist the slow log that shows it?
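For reference, a minimal sketch of the search slow log settings that would capture this, assuming they are set in elasticsearch.yml (the threshold values below are just illustrative, not recommendations):

index.search.slowlog.threshold.query.warn: 10s
index.search.slowlog.threshold.query.info: 1s
index.search.slowlog.threshold.fetch.warn: 1s
index.search.slowlog.threshold.fetch.info: 500ms

The query and fetch phases are logged with separate thresholds, which is what lets you see which phase dominates a slow request.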
Also, Simon's comments on the warmer API (which will affect the query phase, as you correctly identified) are still valid, and 0.20 will come out early next week.
I should add that it seems most of the query time comes from fetch (not from query).
The query phase takes around 2s and the fetch phase takes around 56s.
Through the slow query log we noticed one node in the cluster that always seems to have the slowest times, so we will probably replace it.
But anyway, since we're not retrieving _source (fields: []), do you have any idea what ES could still be "fetching" from the index?
PS: I work in the same place with Felipe Hummel
On Sat, Oct 20, 2012 at 6:24 AM, simonw simon.willnauer@elasticsearch.com wrote:
hey,
On Friday, October 19, 2012 10:59:03 PM UTC+2, Felipe Hummel wrote:
We actually started implementing the calls to the Index Warmer API earlier today, but realized it is only available in 0.20.
Is "coming soon" more like next week or next month?
It's more like weeks, I'd say.
Is it reasonable to use elasticsearch master? We would be upgrading from 0.19.8.
I mean, it would be good to verify that the warmer API helps you! So yeah, go and try!
simon
Thanks!
Felipe Hummel
On Fri, Oct 19, 2012 at 4:49 PM, simonw simon.w...@elasticsearch.com wrote:
hey, this seems very much like a FieldDataCache loading problem. The script you are using requires all values for "created_at" to be loaded into memory. This happens by un-inverting the field once the first query hits an index segment. Usually this is fast, but if a big segment merge happens in the background, the resulting large segments occasionally need to be loaded, and that can easily take this long. We refer to this as segment warming. In 0.20 (likely the next release, coming soon) we have a warmer API that allows you to pre-warm new segments so that you won't suffer from loading FieldDataCaches. If you can wait until 0.20, this should get resolved!
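As a rough illustration (the index and warmer names below are placeholders, and this assumes the 0.20 warmer API), registering a warmer that touches the created_at field data, for example by sorting on it, would look something like:

PUT /myindex/_warmer/warm_created_at
{
  "query" : { "match_all" : {} },
  "sort" : [ "created_at" ]
}

The idea is that the warmer request runs on new segments before they become visible to search, so the FieldDataCache for created_at is already loaded by the time a real query arrives.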
simon
On Friday, October 19, 2012 10:12:23 PM UTC+2, Felipe Hummel wrote:
Hi, we have a ~57GB index (with index compression) with ~36 million documents, on 4 EC2 machines (7.4GB RAM, 2 cores), each holding 2 primary shards (no replicas).
We are not in production yet, so the load is basically developers and other employees playing around with the application.
Yet, we are seeing some very slow queries from time to time. Normally a query takes from 100ms up to 1100ms max! But sometimes one innocent query takes 20 or 30 seconds. Today we saw one taking 55 seconds. But right after, when we run the same query again, it takes the usual 100-1100ms.
The general format of the query is embedded below; note that we're not retrieving _source (fields: []).
{
  "size" : 250,
  "query" : {
    "custom_score" : {
      "query" : {
        "query_string" : {
          "query" : "...",
          "fields" : [ "myField^2.0", "myField2" ],
          "default_operator" : "and"
        }
      },
      "script" : "a = doc['created_at'].value * param1; _score / (a * log(a))",
      "params" : {
        "param1" : 123123123
      }
    }
  },
  "filter" : {
    "bool" : {
      "must" : {
        "range" : {
          "created_at" : {
            "from" : null,
            "to" : "2012-10-17T20:25:10.228Z",
            "include_lower" : true,
            "include_upper" : true
          }
        }
      }
    }
  },
  "fields" : []
}
The machines have around 4GB in use by ES (heap is set with -Xms4G -Xmx4G) and the rest goes to the OS cache (swapping is disabled).
Almost all of the ES config is at its defaults (merge policies, for example). We're on ES version 0.19.8.
We know it may be something sporadic like merging big segments or GC'ing, but 50 seconds is way too strange.
Any ideas?
Thanks
Felipe Hummel