Has_parent query performance

maariolopezz · June 1, 2022, 10:00am

Hello everyone,

I have two types of documents, Doc1 and Doc2. There are about 100 Doc1 documents (parents) and 2,000,000 Doc2 documents (children). Each Doc1 carries data duplicated from other Doc1 documents, and this data can be updated. The scenario is similar to a filesystem in which directories have metadata that is copied down to their child directories. The child documents are large, while the parents have only a few small fields.

If the metadata were denormalized into the children, modifying it at the root directory would force all 2,000,000 child documents to be reindexed, so I have ruled out not using a parent-child relationship. The documentation for the has_parent query says "Its performance degrades as the number of matching parent documents increases", but in my case the number of parents is very small compared to the number of children, and the parent fields will be used for match queries and terms filtering.
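To make the setup concrete, here is a minimal sketch of a join-field mapping and a has_parent search body of the kind described above. The join field name (relation), the relation names (doc1, doc2), and the parent fields (description, tags) are placeholders chosen for illustration, not the real schema. The bodies are built as plain Python dicts and no client call is made.

```python
import json

# Hypothetical names: "relation", "doc1", "doc2", "description" and "tags"
# are placeholders for illustration only.
mapping = {
    "mappings": {
        "properties": {
            # Join field declaring doc1 as the parent of doc2.
            "relation": {"type": "join", "relations": {"doc1": "doc2"}},
            "description": {"type": "text"},
            "tags": {"type": "keyword"},
        }
    }
}


def has_parent_query(text, tags):
    """Build a search body returning doc2 children whose doc1 parent matches."""
    return {
        "query": {
            "has_parent": {
                "parent_type": "doc1",
                "query": {
                    "bool": {
                        # Full-text match on a parent field.
                        "must": [{"match": {"description": text}}],
                        # Exact terms filter on a parent keyword field.
                        "filter": [{"terms": {"tags": tags}}],
                    }
                },
            }
        }
    }


body = has_parent_query("quarterly reports", ["finance", "archive"])
print(json.dumps(body, indent=2))
```

With this layout, a metadata change touches only the small parent document, while the large children stay as indexed.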

Will using the has_parent query have a significant impact on (a) performance and (b) heap usage?

Thanks in advance

system · June 29, 2022, 10:00am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
