Understanding performance impact of parent/child mapping

Emilie_Lavigne · November 26, 2013, 9:54pm

I am investigating whether the parent/child option is viable for our use
case. I would like a few clarifications on how the id cache is populated.

What gets loaded into the id cache: all document _ids, or only parent
_ids?
Are child -> parent mappings also loaded into the cache?
If so, and a child references a non-existent parent at index time, is
that child -> parent mapping still loaded into the cache?

To be more specific, we have about 2 trillion documents indexed across 4
nodes, with a total of 400GB of RAM dedicated to Elasticsearch. In our use
case, only about 10% of documents are likely to have a parent to point to;
most will be orphans. I am trying to understand whether there is a way to
prevent orphan documents (i.e., documents that point to a non-existent
parent such as "NA") from consuming heap memory.
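
To put rough numbers on this, here is a back-of-envelope sketch in Python using only the figures above. It assumes 1 GB = 10**9 bytes and treats the full 400GB as available to the cache, which is an optimistic upper bound since not all of that RAM would be heap. The function name heap_budget and the dictionary keys are made up for illustration; nothing here reflects how Elasticsearch actually sizes the id cache.

```python
def heap_budget(total_docs, total_ram_bytes, nodes, parented_percent):
    """Rough per-document memory budget under two caching scenarios.

    Hypothetical helper for illustration only; it does not model
    Elasticsearch internals.
    """
    # Documents expected to reference a real parent (integer math).
    parented = total_docs * parented_percent // 100
    orphans = total_docs - parented
    return {
        "parented_docs": parented,
        "orphan_docs": orphans,
        "ram_per_node_bytes": total_ram_bytes // nodes,
        # Budget if every document contributes an entry to the cache.
        "bytes_per_doc_if_all_cached": total_ram_bytes / total_docs,
        # Budget if only documents with a real parent contribute.
        "bytes_per_doc_if_only_parented_cached": total_ram_bytes / parented,
    }


if __name__ == "__main__":
    budget = heap_budget(
        total_docs=2 * 10**12,         # "about 2 trillion documents"
        total_ram_bytes=400 * 10**9,   # "400GB", assuming 1 GB = 10**9 bytes
        nodes=4,
        parented_percent=10,           # "about 10%"
    )
    for key, value in budget.items():
        print(f"{key}: {value}")
```

Either way the budget per document is very small, which is why it matters to me whether orphans are cached at all.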

