Cluster keeps doing something after node failure

Hi guys,
We are running a 3-node cluster in production. After a node failure the
cluster started index reallocation. This process took several days, with
small pauses (hours each). When the process finished the cluster looked
fine: everything worked fast and the workload was pretty low. After several
days the "rebalancing" or something else started again. Node 01 (which
failed earlier) has almost 100% CPU load, and there is ~100 Mbit/s of
network traffic from node 02 to node 01. The problem is that we have had
issues with Elasticsearch for more than a week!

I have three questions:

  1. What is Elasticsearch doing, and why?
  2. Why did ES start "doing something" again after a >24h pause?
  3. How can we stop this process, or tell when it will finish? Our clients
    are not happy with the terrible lag while ES is doing something.
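
For question 3, the only idea we have so far is to watch the shard counters in the cluster health response (GET /_cluster/health). Below is a minimal sketch of that check, not something we run in prod. The function name shard_activity and the sample values are made up for illustration. Only the field names (relocating_shards, initializing_shards, unassigned_shards) come from the health API.

```python
import json


def shard_activity(health):
    """Summarise shard movement from a parsed _cluster/health response.

    Returns (busy, counters). busy is True while any shard is relocating,
    initializing or unassigned.
    """
    counters = {
        "relocating": health.get("relocating_shards", 0),
        "initializing": health.get("initializing_shards", 0),
        "unassigned": health.get("unassigned_shards", 0),
    }
    busy = any(count > 0 for count in counters.values())
    return busy, counters


# Hypothetical response body, NOT taken from our cluster.
sample = json.loads(
    '{"status": "yellow", "relocating_shards": 2,'
    ' "initializing_shards": 1, "unassigned_shards": 0}'
)

busy, counters = shard_activity(sample)
print("shards still moving:", busy, counters)
```

If all three counters stay at zero for a while, we would assume the process has finished, but we are not sure this covers everything ES might be doing.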

I've attached a screenshot from Zabbix and our ES config, which is identical
on all nodes.

Thanks in advance for any help
