Slow first query for has_child


(Carlos Carrasco) #1

While evaluating the parent-child feature of Elasticsearch I've noticed
that the first query using has_child after a server restart or a bulk
import takes around 5 minutes (parent type has 17 million docs, child
types have 10 million, ES 0.19.0-RC1). I am aware that ES needs to load
the full _id index for the parent into memory, which is OK, but is there a
way to force this load upon server start, and not have it lazy loaded when
the first has_child query arrives? Or any controlled way to trigger it
so it can be scripted on startup instead of relying on a search query?


(Karussell) #2

I don't think so. But there is a feature request for autowarming:

https://github.com/elasticsearch/elasticsearch/issues/1006

Until that exists, you'll need to do it manually.

Peter.



(Shay Banon) #3

Once the server has started, you can send the relevant queries yourself to warm it up. As I explained in the issue, automatic warming is problematic because it would affect indexing.
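
A minimal sketch of what such a warm-up script could look like, to be run from a startup script after the node is launched. It only assumes the standard `_search` endpoint and the `has_child` query. The function names, index name, type names, retry counts, and timeout are hypothetical placeholders, not something from this thread, and the request body may need adjusting for your version.

```python
import json
import time
import urllib.error
import urllib.request


def build_warmup_body(child_type):
    """Build a minimal has_child search body that matches any child document."""
    return {
        "size": 1,  # the hits are irrelevant; only the cache-loading side effect matters
        "query": {
            "has_child": {
                "type": child_type,
                "query": {"match_all": {}},
            }
        },
    }


def warm_up(base_url, index, child_types, retries=30, delay=10.0, timeout=900.0):
    """Send one has_child query per child type, retrying while the node is unreachable.

    All defaults are illustrative; the long timeout allows for a first query
    that takes several minutes.
    """
    url = "%s/%s/_search" % (base_url.rstrip("/"), index)
    for child_type in child_types:
        payload = json.dumps(build_warmup_body(child_type)).encode("utf-8")
        for _ in range(retries):
            request = urllib.request.Request(
                url, data=payload, headers={"Content-Type": "application/json"}
            )
            try:
                with urllib.request.urlopen(request, timeout=timeout) as response:
                    response.read()  # discard the result
                break  # this child type is warmed, move on to the next one
            except urllib.error.HTTPError:
                raise  # the node answered with an error; retrying will not help
            except OSError:
                time.sleep(delay)  # node not reachable yet (or timed out), try again
        else:
            raise RuntimeError("warm-up query for %r never succeeded" % child_type)
```

For example, `warm_up("http://localhost:9200", "my_index", ["my_child_type"])` would issue one query against a local node. Note that this is still a search query, just one you control: it does not avoid the load time, it only moves it to a moment of your choosing.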

On Wednesday, February 8, 2012 at 3:07 PM, Karussell wrote:

I don't think so. But there is a feature request for autowarming:

https://github.com/elasticsearch/elasticsearch/issues/1006

you'll need to do it manually

Peter.


