I have tribe nodes that look at multiple clusters, but every time there's an index rotation or any sort of maintenance, the tribe node goes down until everything is allocated and back to normal, which can take over an hour.
I heard that Blizzard was given a patch that makes the tribe node ignore cluster events. Could I get a similar patch, or could someone tell me how to patch it myself?
Completely ignoring cluster "events" is inherently dangerous, but some aspects of cluster state updates are expensive and can be ignored on tribe nodes.
Basically, when the indices rotate (new ones are created and old ones are deleted), the tribe node goes into a red state, saying it can't connect to Elasticsearch. This sometimes lasts an hour while the clusters are trying to stabilize. Our users try to access the tribe nodes but get errors, so we are looking to reduce the tribe node's downtime. When I was listening to Blizzard (whose setup looks similar to ours), they mentioned that a patch had been applied that helped with the downtime of the tribe node.