ES cluster throughput drops with a 6-node cluster

Hi,
I am trying to benchmark an ES cluster. The cluster has a single index with a single shard. We need to benchmark the number of read operations the cluster can support.

With 3 master-eligible data nodes, each holding one copy of the index, the cluster delivers approximately 90 requests/sec. Here we have one primary and 2 replicas of the index. The index has a single shard.

After adding one more data node (a 4-data-node ES cluster) and setting the replica count to 3 for the index, the cluster delivers approximately 120 requests/sec. The Kibana graphs show that all nodes have almost the same load and CPU utilization.
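
For reference, the replica count change goes through the index settings API (`PUT /<index>/_settings` with `index.number_of_replicas`). Below is a minimal sketch that only builds the request and does not send it. The host `http://localhost:9200` and the index name `my-index` are placeholders for illustration, not our real values.

```python
import json
import urllib.request


def build_replica_update(base_url, index, replicas):
    """Build (but do not send) a PUT request that sets index.number_of_replicas."""
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Placeholder host and index name (assumptions for illustration only).
request = build_replica_update("http://localhost:9200", "my-index", 3)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To actually apply the change: urllib.request.urlopen(request)
```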

After adding one more data node (a 5-data-node ES cluster) and setting the replica count to 4 for the index, the cluster delivers approximately 150 requests/sec. The Kibana graphs again show that all nodes have almost the same load and CPU utilization.
Up to this point, each node added to the cluster gives us nearly 30 additional requests/sec.

When we add one more data node (6 data nodes) and set the replica count to 5 for the index, so that every node has a copy of the index, cluster throughput falls to approximately 120 to 125 requests/sec. The Kibana graphs also show that one node is utilized more heavily than the others. Ideally, 6 data nodes of the same size should have served ~180 requests/sec.
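
To make the arithmetic explicit, here is a small sketch of the linear extrapolation I am using. The numbers are the ones quoted in this post. Taking the midpoint of the 120 to 125 range as the 6-node value is my own simplification.

```python
# Throughput figures quoted above (requests/sec), keyed by data node count.
# The 6-node value is the midpoint of the observed 120-125 range (an assumption).
observed = {3: 90, 4: 120, 5: 150, 6: 122.5}

# Gain per added node over the range where scaling looked linear (3 -> 5 nodes).
gain_per_node = (observed[5] - observed[3]) / (5 - 3)

# Linear extrapolation from the 5-node result to 6 nodes.
expected_6 = observed[5] + gain_per_node * (6 - 5)
shortfall = expected_6 - observed[6]

print(f"gain per added node: {gain_per_node:.1f} req/sec")
print(f"expected at 6 nodes: {expected_6:.1f} req/sec")
print(f"observed at 6 nodes: {observed[6]:.1f} req/sec")
print(f"shortfall:           {shortfall:.1f} req/sec")
```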

Each node has 2 CPUs and 8 GB of memory (RAM).

Can someone help me understand this behavior?
Is there a better way to scale the cluster?
We want to benchmark read requests because our index will see more read operations than write operations.

How large is your index? How are you distributing requests across the nodes? What type of storage do you have?
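
To make the second question concrete: whichever node receives a search request acts as the coordinating node for it, so a client that always connects to the same node puts the extra coordination work on that node. Below is a minimal sketch of rotating requests across nodes on the client side. The node URLs and the index name are made-up placeholders, and the function only builds the target URL without sending anything.

```python
import itertools
from collections import Counter

# Hypothetical node addresses; replace with the real data node endpoints.
NODES = [f"http://es-node-{i}:9200" for i in range(1, 7)]

_node_cycle = itertools.cycle(NODES)


def next_search_url(index):
    """Return the _search URL on the next node in round-robin order."""
    return f"{next(_node_cycle)}/{index}/_search"


# Simulate 12 requests and count how many each node would receive.
urls = [next_search_url("my-index") for _ in range(12)]
per_node = Counter(url.split("/")[2] for url in urls)
print(per_node)
```

If your benchmark tool points at a single node address, this kind of rotation (or a load balancer in front of the cluster) may be worth trying before drawing conclusions from the numbers.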

Hi,
The primary index is around 3 GB. Do we need to distribute requests across the nodes? ES takes care of distributing the requests across the nodes that hold a replica.
I am using AWS EBS volumes for storage.
Thanks.

What instance types are you using? What is the total size of all indices?

I am using m5.large instances, and the cluster contains only one index apart from the monitoring indices created by ES (.monitoring-es-6-YYYY.MM.DD and .monitoring-kibana-6-YYYY.MM.DD). I wanted to benchmark how much throughput we get with a 3-node cluster and was trying to figure out how much the throughput increases as we scale out the cluster. But it seems that after adding the 6th node to the cluster, the throughput does not increase in proportion.