Hello,
My company has changed its policy, and new servers now run only RHEL. As a result, my current cluster (Elasticsearch v8.11.1) has a mix of operating systems:
1x Master Node - Debian 12.2
2x Data Nodes - Debian 12.2
2x Data Nodes - RHEL 9.3
I recently added the two RHEL data nodes and noticed poor performance on them, even though all four data nodes have identical hardware and configuration. Is the problem caused by the different OS? Should I migrate all nodes to RHEL?
I have attached an image showing the I/O operations rate on each node: the Debian servers handle about 1000-1500 write I/O operations, while the RHEL servers handle only about 200-500.
Are you running with a single dedicated master node? If so, be aware that this is very risky: you could lose the entire cluster and all of its data if that node fails. Always aim to have at least 3 master-eligible nodes in a cluster (if you do not already).
As you have just added these new nodes, it is possible that the shard distribution is not equal and that some nodes hold more shards, or perhaps just more active shards. Identify the indices that receive the most indexing and querying, and use the _cat shards API to see how they are distributed across the data nodes. Once you have ruled this out, it would also be useful to look at OS-level tuning, especially disk I/O related settings, which could differ between the two distributions and result in different I/O levels.
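As a rough illustration of the shard distribution check, here is a minimal Python sketch that counts started shards per node from the JSON output of `GET _cat/shards?format=json`. The function names, the `index_prefix` filter and the default URL are my own placeholders, not anything from your setup; a default 8.x install has security enabled, so you would most likely need HTTPS and credentials, which this sketch leaves out.

```python
import json
from collections import Counter
from urllib.request import urlopen


def shards_per_node(rows, index_prefix=""):
    """Count STARTED shards per (node, primary/replica) pair.

    `rows` is the parsed JSON list returned by _cat/shards?format=json.
    `index_prefix` (hypothetical helper parameter) limits the count to
    indices whose name starts with that prefix, e.g. your busiest indices.
    """
    counts = Counter()
    for row in rows:
        if row.get("state") != "STARTED":
            continue  # skip unassigned, initializing or relocating shards
        if not row.get("index", "").startswith(index_prefix):
            continue
        counts[(row["node"], row["prirep"])] += 1
    return counts


def fetch_shards(base_url="http://localhost:9200"):
    """Fetch shard rows from the cluster.

    Assumes an unauthenticated HTTP endpoint; adjust for TLS and auth.
    """
    url = f"{base_url}/_cat/shards?format=json&h=index,shard,prirep,state,node"
    with urlopen(url) as resp:
        return json.load(resp)
```

Calling `shards_per_node(fetch_shards(), index_prefix="logs-")` (with whatever prefix matches your heavily written indices) gives a count per node, split into primaries (`p`) and replicas (`r`). If the RHEL nodes hold noticeably fewer shards of the busy indices, that alone could explain the lower write I/O.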
If all of this turns up no differences, upgrading all nodes may be useful.