Elasticsearch Cluster - Difference in storage usage between nodes


I have a 3-node cluster with a 32 GB / partition. The ES data folder is /var/lib/elasticsearch, which is under the / partition. Before I ran load testing, I had 8 million records, and the storage used was as follows:

node 1 - 40% of / is used
node 2 - 40% of / is used
node 3 - 25% of / is used

Then I ran load testing, which pumped in 20 million records, taking the total record count to 28 million. After that, storage usage was:

node 1 - 71% of / is used
node 2 - 72% of / is used
node 3 - 28% of / is used

As you can see, node 1 and node 2 show a jump in storage usage after load testing, but node 3 shows hardly any change. I assumed all three nodes would physically hold the same number of records. I don't see any issue with the document count when I check in Dev Tools, but the storage used is not even across the 3 nodes.

Is it expected?

What is the output of the cat shards API? Elasticsearch stores data in shards, and the shard is the unit used for data distribution. If you have only one primary and one replica shard per index, that index's data will be stored on only two nodes, because the total number of shards is 2. If you have one index that is much larger than the others and it has only two shards, seeing this kind of imbalance is not surprising.
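If it helps, here is a rough sketch for summarising that output per node. It assumes the cluster is reachable at $elasticsearchURL (the variable used elsewhere in this thread) and that the request asks for the columns index, shard, prirep, state, store and node with bytes=b so that the store column is a plain byte count. The function name shards_per_node is made up for illustration, and node names containing spaces would not be handled.

```shell
#!/bin/sh
# Summarise `_cat/shards` output per node: shard count and total store bytes.
# Expects whitespace-separated columns in this order:
#   index shard prirep state store node
# Only STARTED shards are counted; unassigned or relocating rows are skipped.
shards_per_node() {
  awk '$4 == "STARTED" { count[$6]++; bytes[$6] += $5 }
       END { for (n in count) printf "%s %d %.0f\n", n, count[n], bytes[n] }' | sort
}

# Usage against a live cluster (assumes $elasticsearchURL is set):
#   curl -s "$elasticsearchURL/_cat/shards?h=index,shard,prirep,state,store,node&bytes=b" \
#     | shards_per_node
```

Each output line is a node name, the number of started shards on it, and the bytes they occupy. A node that holds no shard of the large index should stand out with a much smaller byte total.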

Yeah, I guess.

I'm creating the index as below:

curl -s -X PUT "$elasticsearchURL/$indexName?pretty" -H 'Content-Type: application/json' -d '{"settings":{"index":{"number_of_shards" : 1,"number_of_replicas" : 1}}}'

So will this have any impact on functionality?

If you want the cluster to be able to distribute data evenly you will likely need to set number_of_shards to 3.
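As a minimal sketch, the create command from the question could be adapted like this. The helper name index_settings_body is hypothetical, and $elasticsearchURL and $indexName are assumed to be set as in the question. Note that number_of_shards is fixed when an index is created, so this applies to indices created from now on.

```shell
#!/bin/sh
# Build the index settings body. Defaults follow the suggestion above
# (3 primary shards) and keep the replica count from the question (1).
index_settings_body() {
  shards=${1:-3}
  replicas=${2:-1}
  printf '{"settings":{"index":{"number_of_shards":%d,"number_of_replicas":%d}}}\n' \
    "$shards" "$replicas"
}

# Usage (assumes $elasticsearchURL and $indexName are set, as in the question):
#   curl -s -X PUT "$elasticsearchURL/$indexName?pretty" \
#     -H 'Content-Type: application/json' -d "$(index_settings_body 3 1)"
```

With 3 primaries and 1 replica each, the index has 6 shards in total, which a 3-node cluster would typically spread as 2 per node.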

OK. Will it have a performance impact or cause data loss in any scenario?

Given the size of your cluster I do not think that is likely.

We recently added docs that answer exactly this question - see the section that starts

TIP: It is normal for the nodes in your cluster to be using very different amounts of disk space....

Thank you for the quick turnaround. Can I increase the number of shards in the existing cluster from 1 to 3?