I have a 49-node ES cluster with roughly the same amount of storage on each node (900GB). Most indices have 3-5 primary shards. Indices and shards vary widely in size.
Every so often, all primary shards of one of the larger indices land on the same ES node. This can push disk usage past the high watermark, or simply reduce the ingestion rate, which causes lag in our logging pipeline.
We already use node.max_local_storage_nodes: 1 and shard allocation awareness based on node attributes, but as far as I know that only ensures that a primary shard and its replica do not end up on the same node...
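To confirm which indices are affected, one option is to group primary shards by node. The following is a minimal sketch, not a definitive tool: it assumes the shard listing has already been fetched as JSON rows shaped like the output of `GET _cat/shards?format=json` (fields `index`, `prirep`, `node`). The function names, the `min_primaries` threshold, and the sample index and node names are all hypothetical.

```python
from collections import defaultdict


def primaries_per_node(shards):
    """Map each index to {node: number of assigned primary shards}."""
    counts = defaultdict(lambda: defaultdict(int))
    for row in shards:
        # "p" marks a primary; unassigned shards have no node and are skipped.
        if row.get("prirep") == "p" and row.get("node"):
            counts[row["index"]][row["node"]] += 1
    return {index: dict(nodes) for index, nodes in counts.items()}


def colocated_indices(shards, min_primaries=3):
    """Return {index: node} for indices whose primaries all sit on one node."""
    result = {}
    for index, nodes in primaries_per_node(shards).items():
        total = sum(nodes.values())
        if total >= min_primaries and len(nodes) == 1:
            result[index] = next(iter(nodes))
    return result


# Hypothetical sample rows; real rows would come from the cat shards API.
sample = [
    {"index": "apm-hypothetical-000001", "shard": "0", "prirep": "p", "node": "node-07"},
    {"index": "apm-hypothetical-000001", "shard": "1", "prirep": "p", "node": "node-07"},
    {"index": "apm-hypothetical-000001", "shard": "2", "prirep": "p", "node": "node-07"},
    {"index": "apm-hypothetical-000001", "shard": "0", "prirep": "r", "node": "node-12"},
    {"index": "logs-hypothetical", "shard": "0", "prirep": "p", "node": "node-01"},
    {"index": "logs-hypothetical", "shard": "1", "prirep": "p", "node": "node-02"},
    {"index": "logs-hypothetical", "shard": "2", "prirep": "p", "node": "node-03"},
]

for index, node in colocated_indices(sample).items():
    print(f"{index}: all primaries on {node}")
```

Running this periodically against the live shard listing would show whether the problem is limited to particular index patterns.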
That certainly sounds like what I'm looking for. Thank you!
Unfortunately, this does not seem to work as expected.
It always seems to be the APM indices, so maybe it is not an Elasticsearch issue but an APM Server thing? Or an ES ingest pipeline thing... I need to dig deeper, I guess.