_parent cache investigation

jonatanzafar59 · March 8, 2016, 11:26am

Hi,

Background:
For the past 2 months, 4 of the 18 servers in our cluster have been misbehaving (their GCs take 3 times longer).
After digging deep into the cluster's memory usage, we noticed that _parent field caching is high on exactly the problematic servers.
_parent field caching is the result of the parent-child relationship. ElasticSearch holds the parent-child mapping in memory (one to many), keeping every parent's _id string in memory along with its corresponding children.
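
For anyone wanting to reproduce the per-node comparison, here is a minimal sketch. It assumes the node stats response (e.g. from GET /_nodes/stats/indices/fielddata?fields=_parent) has already been fetched and parsed into a dict shaped as nodes -> node id -> indices -> fielddata -> fields -> _parent -> memory_size_in_bytes. The helper names and the 2x-median threshold are hypothetical, chosen only for illustration.

```python
from statistics import median


def parent_fielddata_by_node(stats):
    """Map node name -> bytes of _parent fielddata, from a parsed node-stats dict.

    The dict layout is an assumption about the node stats response shape.
    """
    usage = {}
    for node_id, node in stats.get("nodes", {}).items():
        fields = node.get("indices", {}).get("fielddata", {}).get("fields", {})
        name = node.get("name", node_id)
        usage[name] = fields.get("_parent", {}).get("memory_size_in_bytes", 0)
    return usage


def outlier_nodes(usage, factor=2.0):
    """Return nodes whose usage exceeds `factor` times the cluster median."""
    if not usage:
        return []
    mid = median(usage.values())
    return sorted(n for n, b in usage.items() if mid > 0 and b > factor * mid)
```

Calling outlier_nodes(parent_fielddata_by_node(stats)) lists the nodes worth a closer look. The median is used rather than the mean so that a few heavy nodes do not hide themselves by pulling the baseline up.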

We tried:
• Analyzing the children/parent spread across shards to detect an anomaly. No anomaly was found.
• Removing the data from the problematic servers and letting it replicate again.
• Removing an entire server and recreating it.
• Switching shards between a problematic server and a healthy one.
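
The first check in the list, the children/parent spread across shards, could be sketched roughly like this. It assumes the per-shard parent and child counts were collected separately; the function names and the 50% tolerance are hypothetical and are not taken from the actual analysis.

```python
def children_per_parent(shard_counts):
    """shard_counts: {shard_id: (parent_count, child_count)} -> {shard_id: ratio}."""
    return {s: (c / p if p else 0.0) for s, (p, c) in shard_counts.items()}


def skewed_shards(shard_counts, tolerance=0.5):
    """Return shards whose children-per-parent ratio deviates from the mean
    ratio by more than `tolerance` (expressed as a fraction of the mean)."""
    ratios = children_per_parent(shard_counts)
    if not ratios:
        return []
    mean = sum(ratios.values()) / len(ratios)
    if mean == 0:
        return []
    return sorted(s for s, r in ratios.items() if abs(r - mean) / mean > tolerance)
```

An empty result corresponds to the "no anomaly" outcome described in the first bullet.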

None of the above worked. What raised the red flag was the last attempt, which made the healthy server problematic but did not make the problematic one healthy. I started wondering whether this cache behaves differently from the rest of the field data and is never cleaned up.

I searched for a clear cache API for the _parent field, found one, and ran it against the cluster. As expected, this cleared the cache.
I expected the cache to spike again once a parent-child query occurred, but that didn't happen.
Since yesterday we have had thousands of parent-child queries, but none of them has grown the cache.

Can you please help me investigate this case more deeply?

ElasticSearch version 1.6.0
Thank you.

jpountz · March 8, 2016, 1:22pm

I agree this is confusing. For the record, the parent id cache is almost entirely stored on disk from 2.0 onwards.
