I’m curious if anyone is able to comment on the handling of nested documents at query time. Are nested documents skipped over in the
nextDoc step, or must they also be considered if, for example, nested docs share some field names with the parent documents?
And somewhat relatedly, how does the ratio of root:nested documents affect performance?
Thanks in advance!
Nested docs should not share field names with a parent doc, unless you use
include_in_root, which we have been considering deprecating for a long time.
But if you do end up using this parameter and have shared fields, then as long as you are not using a
nested query, the query will run only against top-level docs, because we internally rewrite the query to add a filter that skips nested docs.
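To illustrate the idea of that rewrite, here is a toy sketch in Python. It is not how Elasticsearch or Lucene implement it; the `_is_nested` marker, the document layout, and all function names are hypothetical and exist only for this example. It only shows that wrapping the original query in an extra "root docs only" filter removes nested docs from the hits even when they share a field name.

```python
# Toy model: every Lucene-level document is a dict.
# "_is_nested" is a hypothetical marker used only for this illustration.
docs = [
    {"id": 0, "_is_nested": True,  "name": "alice"},  # nested child of doc 1
    {"id": 1, "_is_nested": False, "name": "alice"},  # root doc sharing the field name
    {"id": 2, "_is_nested": True,  "name": "bob"},    # nested child of doc 3
    {"id": 3, "_is_nested": False, "name": "carol"},  # root doc
]


def term_query(field, value):
    """Match documents whose field equals the given value."""
    return lambda doc: doc.get(field) == value


def non_nested_filter():
    """Match only root (top-level) documents."""
    return lambda doc: not doc["_is_nested"]


def rewrite_top_level(query):
    """Combine the user's query with the root-only filter."""
    root_only = non_nested_filter()
    return lambda doc: query(doc) and root_only(doc)


def search(query):
    """Return the ids of all matching documents, in index order."""
    return [doc["id"] for doc in docs if query(doc)]


raw_hits = search(term_query("name", "alice"))
rewritten_hits = search(rewrite_top_level(term_query("name", "alice")))
```

Without the extra filter the nested doc with the shared field would match as well; with it, only the root doc is returned.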
Thanks for your reply and the info @mayya. In hindsight, I think my actual question would have been clearer if I had omitted the part about shared fields.
To follow up: when searching an index with nested fields, can nested docs be efficiently skipped if there are no
nested queries (i.e. the nested fields are not part of the query)? To what degree will performance be affected if the number of nested documents hugely outnumbers the number of parent documents?
I am not entirely clear on what you mean by "efficiently skipped", but since you mentioned
nextDoc, I think you are talking about a postings list. If you search with a query on a field "X", the postings list for that field contains only documents that have the field. In your second scenario these will only be parent documents, so the number of nested documents is not relevant here and has no impact on the speed of the query.
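As a rough illustration of that point, here is a simplified inverted index in Python. It is only a sketch: real Lucene postings are compressed and use skip data, and the field names (`title`, `comments.text`) and helper functions are made up for this example. It shows that the postings list for a root-only field has the same length no matter how many nested docs sit in the index.

```python
from collections import defaultdict


def build_postings(docs):
    """Map (field, term) to the list of doc ids containing that term."""
    postings = defaultdict(list)
    for doc_id, fields in enumerate(docs):
        for field, value in fields.items():
            for term in value.split():
                postings[(field, term)].append(doc_id)
    return postings


def make_index(nested_per_root):
    """Build 3 root docs, each preceded by its nested docs (hypothetical layout)."""
    docs = []
    for i in range(3):
        for j in range(nested_per_root):
            docs.append({"comments.text": f"reply {j}"})  # nested doc
        title = "goodbye world" if i == 1 else "hello world"
        docs.append({"title": title})                     # root doc
    return docs


small = build_postings(make_index(nested_per_root=2))
large = build_postings(make_index(nested_per_root=200))

# Iterating the postings for title:hello touches the same number of docs
# in both indices, although the second one holds 100x more nested docs.
small_hits = small[("title", "hello")]
large_hits = large[("title", "hello")]
print(len(small_hits), len(large_hits))
```

The nested docs only show up in the postings of their own fields, so a query on `title` never has to step over them in this model.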
Thanks @mayya, and sorry for my confusing terminology. You’ve answered my question.