Parent-child query speed


(Kamil) #1

Hi,

I've got some problems with the speed of the Elasticsearch has_child query. I know that many people report that it is slow, but for me the query that retrieves parents is more than 80 times slower than the same query run on the children. Moreover, it became 2 times slower when I reindexed the data using Elasticsearch 2.1 (compared to 1.7).

There are around 37M child documents and 32M parent documents in the index. A custom function_score is used on both of them to compute the results. The typical use case is to build a fairly complicated query using nested documents on the children, wrap it in a has_child query, and then use the maximum score of the children to compute the score of the returned parents.
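For illustration, the shape of such a request can be sketched as a Python dict. This is only a rough sketch, not my actual query: the type name "child", the nested path "attributes", the field "attributes.name" and the scoring field "popularity" are all made-up placeholders, and the real function_score is more involved.

```python
import json


def build_has_child_query(child_type, nested_path, nested_field, term):
    """Build a has_child request body (hypothetical field names throughout)."""
    # Inner query that runs against the nested documents of each child.
    nested_query = {
        "nested": {
            "path": nested_path,
            "query": {"match": {nested_path + "." + nested_field: term}},
        }
    }
    # Custom scoring of the children; "popularity" is an assumed numeric field.
    scored_child_query = {
        "function_score": {
            "query": nested_query,
            "functions": [{"field_value_factor": {"field": "popularity"}}],
            "boost_mode": "multiply",
        }
    }
    # Wrap the child query so that parents are returned, each scored
    # with the maximum score among its matching children.
    return {
        "query": {
            "has_child": {
                "type": child_type,
                "score_mode": "max",
                "query": scored_child_query,
            }
        }
    }


body = build_has_child_query("child", "attributes", "name", "example")
print(json.dumps(body, indent=2))
```

The "same query on children" mentioned above corresponds to sending only the function_score part against the child type, without the has_child wrapper.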

The average time of a has_child query was around 600ms with Elasticsearch 1.7, and is around 1200ms with Elasticsearch 2.1, on a machine with 30GB of RAM (ES has 16GB of heap).

Is there anything I can do to improve those speeds, or should I just give up on using parent-child relationships? The slowdown stated in the documentation (5-10x) would be just fine.


(Christian Dahlqvist) #2

As you seem to have very few child documents per parent, I would try flattening the structure through denormalisation and see how it performs without using parent-child relationships.
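A minimal sketch of what that flattening could look like before indexing, assuming the parents are available as a dict keyed by ID and each child carries a "parent_id" field. All names here are hypothetical, and copying every parent field is just one possible choice:

```python
def denormalise(parents_by_id, children):
    """Copy the parent's fields onto each child so no join is needed at query time."""
    flat_docs = []
    for child in children:
        parent = parents_by_id[child["parent_id"]]
        doc = dict(child)  # shallow copy, the input child is left untouched
        for key, value in parent.items():
            # Prefix parent fields to avoid clashing with the child's own fields.
            doc["parent_" + key] = value
        flat_docs.append(doc)
    return flat_docs


# Hypothetical sample data, only to show the resulting document shape.
parents_by_id = {"p1": {"name": "Parent one", "category": "books"}}
children = [
    {"id": "c1", "parent_id": "p1", "text": "first child"},
    {"id": "c2", "parent_id": "p1", "text": "second child"},
]

flat_docs = denormalise(parents_by_id, children)
print(flat_docs[0])
```

The flattened documents can then be queried directly with the child query. Getting one result per parent would require grouping hits by parent_id, either in an aggregation or on the client side.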

