Shards getting marked as stale frequently causing cluster to go yellow

I meant it's an array of strings (with duplicates). Either way, the strings end up in memory once the JSON is parsed (for the ctx._source), whether they come from the document or from the request that is sent. That's what I was talking about.

And yes, even the JSON object could grow too

Reducing the JSON size seems to have done the trick. We'll observe for a few more days. Before that, can you tell me what the impact of increasing G1HeapRegionSize is? What should I expect if I change it from 8 MB to 16 MB? I'll go through our GC logs, see how many humongous regions are being created, and compare that with previous incidents to decide whether we should tweak the region size at all.
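For the GC log check, here is a minimal Python sketch of what I have in mind. It rests on two assumptions: G1 treats an allocation as humongous once it is at least half a region in size, so moving from 8 MB to 16 MB regions would raise that threshold from 4 MB to 8 MB; and the GC log uses JDK unified logging with lines of the form `GC(42) Humongous regions: 12->4`. The function names and sample lines are made up for illustration, and the pattern will need adjusting if your log lines look different.

```python
import re

# Assumed line shape from unified JVM logging (gc+heap):
#   "[...][gc,heap] GC(42) Humongous regions: 12->4"
HUMONGOUS_RE = re.compile(r"GC\((\d+)\).*Humongous regions: (\d+)->(\d+)")


def humongous_threshold_bytes(region_size_mb):
    """Size at which G1 treats an allocation as humongous (half a region)."""
    return region_size_mb * 1024 * 1024 // 2


def humongous_counts(lines):
    """Return (gc_id, regions_before, regions_after) for each matching line."""
    counts = []
    for line in lines:
        match = HUMONGOUS_RE.search(line)
        if match:
            counts.append(tuple(int(g) for g in match.groups()))
    return counts


# Hypothetical sample lines, not taken from a real cluster.
SAMPLE_LINES = [
    "[2020-05-20T10:00:00.000+0000][1234][gc,heap] GC(41) Humongous regions: 3->1",
    "[2020-05-20T10:00:00.000+0000][1234][gc,heap] GC(41) Eden regions: 100->0(90)",
    "[2020-05-20T10:00:05.000+0000][1234][gc,heap] GC(42) Humongous regions: 12->4",
]
```

Running `humongous_counts` over the log from a quiet period and from a previous incident should show whether humongous allocations are common enough for the region size to matter.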

Almost a week without circuit breaker tripping. Looks like it has helped!

Just a minor concern: do you happen to know how G1GC behaves when snapshots are taken? We see a spike in our young GC time (it goes up to 500ms-1s). We also see the number of young GCs increase, so maybe the spike in time is a consequence of the higher GC frequency.

Elasticsearch, however, wasn't affected in any noticeable way. Does the young gen time reported in the node stats (most likely via JVM APIs) include the time taken by concurrent phases, or only the paused phases? We have also seen an increase in the time taken to dedupe strings while snapshots are being taken. My guess is that G1GC is taken by surprise by the memory pressure the snapshot puts on the heap.
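To tell "more collections" apart from "slower collections", one option is to sample the node stats before and after a snapshot and compare the deltas. The sketch below assumes the `collection_count` and `collection_time_in_millis` fields found under `jvm.gc.collectors.young` in the `GET _nodes/stats/jvm` response; the helper name and the sample numbers are hypothetical.

```python
def young_gc_delta(before, after):
    """Compare two snapshots of the 'young' collector stats for one node.

    Returns (collections, total_millis, avg_millis_per_collection)
    for the interval between the two samples.
    """
    count = after["collection_count"] - before["collection_count"]
    millis = after["collection_time_in_millis"] - before["collection_time_in_millis"]
    avg = millis / count if count > 0 else 0.0
    return count, millis, avg


# Hypothetical values sampled just before and just after a snapshot.
BEFORE = {"collection_count": 1000, "collection_time_in_millis": 50000}
AFTER = {"collection_count": 1040, "collection_time_in_millis": 62000}
```

If the average per collection stays flat while the count climbs, the spike in total young GC time would mostly come from frequency rather than from longer individual collections.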

Any recommendations to reduce young gen GC frequency?

More info available here

It's certainly possible: taking a snapshot does involve moving a load of data around and maybe we're creating more garbage than needed. There's been a lot of work on streamlining the snapshot process since 7.2. Off the top of my head I'm not sure if any of it would directly affect memory pressure but it might. Can you reproduce this in 7.7?

I'll need to check that, @DavidTurner.
We may plan an in-place upgrade, as Elasticsearch has already reached 7.7.
