Has_parent query performance

Hello everyone,

I have two types of documents, Doc1 and Doc2. There are about 100 Doc1 documents (parents) and 2,000,000 Doc2 documents (children). Each Doc1 carries data duplicated from other Doc1s, and that data can be updated. The scenario is similar to a filesystem in which directories have some metadata and that metadata is copied down to their child directories. Also, the child documents are large, while the parents have only a few small fields.
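To make the setup concrete, here is a rough sketch of the kind of mapping I have in mind, written as plain Python dicts that could be sent as request bodies. The field names (`relation`, `metadata`, `content`) and the relation names (`doc1`, `doc2`) are placeholders made up for this post, not my real schema.

```python
# Sketch only: all field and relation names below are hypothetical placeholders.

def build_mapping():
    """Index mapping with a join field relating doc1 (parent) to doc2 (child)."""
    return {
        "mappings": {
            "properties": {
                # The join field declares which relation name is the parent of which.
                "relation": {
                    "type": "join",
                    "relations": {"doc1": "doc2"},
                },
                # Small field that lives only on the parent documents.
                "metadata": {"type": "keyword"},
                # Large body that lives only on the child documents.
                "content": {"type": "text"},
            }
        }
    }


def parent_document(metadata):
    """Body of a parent (doc1) document."""
    return {"metadata": metadata, "relation": {"name": "doc1"}}


def child_document(parent_id, content):
    """Body of a child (doc2) document pointing at its parent's ID."""
    return {
        "content": content,
        "relation": {"name": "doc2", "parent": parent_id},
    }
```

Each child has to be indexed with its routing value set to the parent ID so that parent and child land on the same shard. With this layout, updating a parent's `metadata` touches only that one small document.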

Since modifying the metadata at the root directory would force all 2,000,000 child documents to be reindexed, the option of not using a parent-child relationship is off the table. The documentation for the has_parent query says "Its performance degrades as the number of matching parent documents increases", but in my case the number of parents is very small compared to the number of children. The parent fields will be used for match queries and terms filtering.
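For reference, the queries I expect to run look roughly like the sketch below, again as a Python dict for the search request body. The parent type `doc1` and the fields `description` and `metadata` are hypothetical names used only for illustration.

```python
# Sketch only: "doc1", "description" and "metadata" are hypothetical names.

def build_has_parent_query(text, tags):
    """Search body returning children whose parent matches `text` and has any of `tags`."""
    return {
        "query": {
            "has_parent": {
                "parent_type": "doc1",
                # Parent relevance scores are not propagated to the children.
                "score": False,
                "query": {
                    "bool": {
                        # Full-text match on a parent field.
                        "must": [{"match": {"description": text}}],
                        # Exact-value filtering on a parent field.
                        "filter": [{"terms": {"metadata": tags}}],
                    }
                },
            }
        }
    }
```

For example, `build_has_parent_query("quarterly reports", ["finance", "2023"])` would ask for all child documents whose parent matches that text and carries at least one of those two metadata values.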

Will using the has_parent query have a significant impact on (a) performance and (b) heap usage?

Thanks in advance
