We have set up an Elasticsearch cluster consisting of 8 nodes in total: 6 of them are both master-eligible and data nodes, and the remaining 2 are data-only nodes. All of them are servers with the same spec.
In our performance test we generate a fairly high rate of search requests, say 800 TPS on average, for hours.
During the test, a master node's write IOPS spikes, which results in a high load average.
I would like to know why so much more writing happens on the master node than on the data nodes.
For example, the master node's IOPS hits about 50, while the other nodes' IOPS rarely rises to 5.
If this is the volume you expect in your cluster, you shouldn't let a data node double as master, as that will affect its ability to maintain the cluster state during heavy workloads.
Instead, you should consider setting up 3 dedicated master-eligible nodes (with node.master: true and node.data: false) that hold no index data, and let the remaining 5 nodes be data-only (node.master: false). An advantage of this setup is that the master-eligible nodes can run on lightweight servers with less disk space and fewer CPU cores than the heavy-duty data nodes, which must handle both indexing and searches. I use this separation of data and master-eligible nodes in my clusters, and it has served me well so far.
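As a rough sketch of that split, assuming an Elasticsearch version where the node.master and node.data settings still apply (newer releases replace them with node.roles), the elasticsearch.yml on each of the 3 dedicated master-eligible nodes could look something like this:

```yaml
# Dedicated master-eligible node: takes part in master election,
# holds no index data.
node.master: true
node.data: false
```

And on each of the 5 data-only nodes:

```yaml
# Data-only node: holds shards and serves indexing and search,
# is never elected master.
node.master: false
node.data: true
```

Only the two settings mentioned above are shown; everything else in the file (cluster name, discovery, paths and so on) stays as it is in your current setup.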
A master node normally doesn't do heavy writing, as it stores very little data. My best guess is that the heavy writing you saw came from the data-node role on that server (indexing, shard updates, segment merging, etc.), which is why I suggested using dedicated masters and dedicated data nodes in your cluster.