Disk Space Utilization on Single Node

shivkumar · March 30, 2017, 12:50pm

Hello,

I have configured an Elasticsearch cluster.
I am using version 2.4.4.
My environment contains one master node and two data nodes.
I am loading data into Elasticsearch using Logstash.

Please refer to the following output configuration from my Logstash setup.

elasticsearch {
    hosts => ["http://10.0.1.177:9200/", "http://10.0.1.239:9200/"]
    index => "%{[@index_name][es_index]}_%{+YYYY_M}"
    document_type => "%{[@index_type][es_document_type]}"
}

What I see is that disk space utilization is always high on a single data node.
I was expecting it to be the same on both data nodes, since I have set number_of_replicas to 1.
I am not able to understand this behavior of my cluster.

Please refer to the following image, which shows my disk space utilization.

Color codes:

Orange: Master Node
Green: Data Node 1
Blue: Data Node 2

You can see that Data Node 1 (green line) is much higher than the others.

Could anybody please help me understand this behavior, so that I can configure my environment correctly?

Two other things I am trying to understand:

Is data also stored on the master node if I provide the IP of the master node?
How beneficial would it be to configure a client node?

Thank you

spinscale · March 31, 2017, 7:11am

Hey,

A master-only node never stores data. You need to find out which data is on which node (take a look at the cat APIs, especially the shards one) and check whether the data is distributed evenly or whether you have one big shard that eats up all the space. It is also possible that Elasticsearch is not the system eating up your space; you can't tell from the information provided.

--Alex
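
As a rough illustration of the check suggested above, the following Python sketch sums shard store sizes per node from the plain-text output of the cat shards API. It assumes the output was requested with the columns index, shard, prirep, state, store and node, with sizes in bytes (for example `curl 'http://10.0.1.177:9200/_cat/shards?h=index,shard,prirep,state,store,node&bytes=b'`). The function name, the sample index name and the node names are hypothetical and only serve the example.

```python
from collections import defaultdict


def store_bytes_per_node(cat_shards_text):
    """Sum shard store sizes (in bytes) per node.

    Assumes each line has the columns:
    index shard prirep state store node
    """
    totals = defaultdict(int)
    for line in cat_shards_text.splitlines():
        fields = line.split()
        # Unassigned shards have no store size or node, so they yield
        # fewer fields. Only count shards that are fully started.
        if len(fields) < 6 or fields[3] != "STARTED":
            continue
        store = int(fields[4])
        # Node names may contain spaces, so join the remaining fields.
        node = " ".join(fields[5:])
        totals[node] += store
    return dict(totals)


# Hypothetical sample output, not taken from the cluster in this thread.
SAMPLE = """\
logs_2017_3 0 p STARTED 5000 data-node-1
logs_2017_3 0 r STARTED 5000 data-node-2
logs_2017_3 1 p STARTED 7000 data-node-1
logs_2017_3 1 r UNASSIGNED
"""

if __name__ == "__main__":
    for node, size in sorted(store_bytes_per_node(SAMPLE).items()):
        print(node, size)
```

If one node's total is much larger than the other's, listing the individual shards on that node (and any UNASSIGNED replicas) should show where the imbalance comes from. If the totals are similar, something other than Elasticsearch is probably using the disk.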

Christian_Dahlqvist · March 31, 2017, 7:24am

Make sure all nodes in the cluster are running exactly the same version of Elasticsearch. If you have different versions, shards that get created on a node with a newer version cannot be replicated to a node with an older version. This can lead to the type of imbalance you are seeing.
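
One way to check this, sketched in Python under some assumptions: the text below is expected to come from the cat nodes API with the version column first and the node name second (for example `curl 'http://10.0.1.177:9200/_cat/nodes?h=version,name'`). The helper names and the sample node names and versions are hypothetical.

```python
def versions_by_node(cat_nodes_text):
    """Map node name to version from lines of the form '<version> <name>'."""
    result = {}
    for line in cat_nodes_text.splitlines():
        # Split only once so that node names containing spaces stay intact.
        parts = line.split(None, 1)
        if len(parts) == 2:
            version, name = parts
            result[name.strip()] = version
    return result


def has_mixed_versions(cat_nodes_text):
    """Return True if the nodes report more than one distinct version."""
    return len(set(versions_by_node(cat_nodes_text).values())) > 1


# Hypothetical sample output, not taken from the cluster in this thread.
SAMPLE_NODES = """\
2.4.4 master-node
2.4.4 data-node-1
2.4.1 data-node-2
"""

if __name__ == "__main__":
    print(versions_by_node(SAMPLE_NODES))
    print("mixed versions:", has_mixed_versions(SAMPLE_NODES))
```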

system · April 28, 2017, 7:24am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
