Can indices recovering after a red/yellow state cause writes to get stuck?

There is something I don't understand, and I'm not sure whether it is unusual.

When my ES cluster (42 nodes with 32GB, 20000 shards, on an old 5.6 version; we aim to move to 7.x by the end of Q1 2022, so we have to live with this situation for a few more months) goes down, it first turns red (in the worst case), then yellow, and I can clearly see that the primary shards of all my daily indices are recovered quickly. So far, so good.

In the "shards activity monitor" (Kibana), I can also clearly see "INDEX" operations related to rebuilding their replica(s). Still good here?

But what I don't fully understand is why the Spark Streaming job that writes data (in bulks) seems stuck.
At first I thought it was because the replicas were unavailable, but I'm not sure anymore.
Is it because some indices are read-only during those operations?
Is it because the cluster load is too heavy (slowed down by the INDEX operations)?
Something else?

Thanks for your advice :wink:


5.6 is super old and there have been tonnes of improvements around sharding between it and 7.x. It's highly likely that's part of the issue here.
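
Until the upgrade happens, one way to narrow down which of the hypotheses in the question applies (read-only indices versus an overloaded cluster) is to query the cluster while the recovery is running. The following is only a minimal sketch, not a definitive diagnostic: `ES_URL` and all function names are made up for illustration, and it assumes the standard `_cluster/health`, `_settings` and `_cat/thread_pool` REST endpoints, with the bulk thread pool being named `bulk` as it is, to my knowledge, on 5.x.

```python
import json
import urllib.request

# Assumption: address of any reachable node; replace with a real one.
ES_URL = "http://localhost:9200"


def get_json(path):
    """Fetch a REST endpoint from the cluster and decode its JSON body."""
    with urllib.request.urlopen(ES_URL + path, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


def blocked_indices(settings):
    """Return {index: [block names]} for indices with a write-affecting block enabled.

    `settings` is the parsed response of GET /_all/_settings/index.blocks.*
    """
    blocking = ("read_only", "write")
    result = {}
    for index, body in settings.items():
        blocks = body.get("settings", {}).get("index", {}).get("blocks", {})
        # Setting values come back as strings ("true" / "false").
        active = [name for name in blocking
                  if str(blocks.get(name, "false")).lower() == "true"]
        if active:
            result[index] = active
    return result


def nodes_rejecting_bulks(rows):
    """Return {node_name: rejected_count} for nodes that have rejected bulk requests.

    `rows` is the parsed response of
    GET /_cat/thread_pool/bulk?format=json&h=node_name,active,queue,rejected
    """
    return {row["node_name"]: int(row["rejected"])
            for row in rows if int(row["rejected"]) > 0}


def diagnose():
    """Print a short summary of the three checks against the live cluster."""
    health = get_json("/_cluster/health")
    print("status:", health["status"])
    print("initializing shards:", health["initializing_shards"])
    print("unassigned shards:", health["unassigned_shards"])
    print("pending cluster tasks:", health["number_of_pending_tasks"])

    settings = get_json("/_all/_settings/index.blocks.*")
    print("indices with write-affecting blocks:", blocked_indices(settings))

    rows = get_json(
        "/_cat/thread_pool/bulk?format=json&h=node_name,active,queue,rejected")
    print("nodes rejecting bulk requests:", nodes_rejecting_bulks(rows))
```

Calling `diagnose()` while the replicas are being rebuilt gives a rough picture: an empty result for blocked indices would argue against the read-only hypothesis, while growing `rejected` counts or a large number of pending cluster tasks would point towards load.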

