Shard and replica allocation in Elasticsearch


#1

Hi,
I am new to Elasticsearch and still learning it. Can I please know how to calculate the number of shards and replicas required for an index? I know that a number of factors should be considered, but I am confused after reading articles on the internet: some say to use 40% of the data node capacity, while others use the JVM heap size to calculate the shard count. For example, I have a 9-node cluster with 10GB of storage per node, the JVM heap size is 32GB, and I want to index 50GB of data. How many shards and replicas do I need? What is the best practice? Any detailed explanation would be really helpful.

Thank you


(Mark Walkom) #2

10GB, are you sure that is correct?


#3

Yes, each node has 10GB of storage.


(Mark Walkom) #4

That's a huge heap size for a small amount of data. It's not really efficient.


#5

Thank you for the reply. So, do we assign the shards based on the heap size of the node? What I am trying to understand is which factors the assignment of shards and replicas depends on.


(Mark Walkom) #6

Shard count per node, and disk space use.
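
To make those two factors concrete, here is a rough sketch using the numbers from this thread (9 nodes, 10GB of disk each, 50GB of data). It is not an official sizing formula. The function name `plan_shards` is hypothetical, the 30GB target shard size is only an illustrative assumption, and the 0.85 usable-disk fraction is assumed from Elasticsearch's default low disk watermark of 85%. It also treats the on-disk index size as equal to the raw data size, which is often not true in practice.

```python
import math

def plan_shards(data_gb, nodes, disk_per_node_gb, replicas=1,
                target_shard_gb=30.0, usable_disk_fraction=0.85):
    """Rough estimate of shard count and disk fit (illustrative only)."""
    # Disk per node that can be used before allocation is restricted
    # (assumption: 85%, the default low disk watermark).
    usable_per_node_gb = disk_per_node_gb * usable_disk_fraction

    # A single shard must fit on one node, so cap the shard size
    # at the usable disk of a node.
    max_shard_gb = min(target_shard_gb, usable_per_node_gb)
    primaries = max(1, math.ceil(data_gb / max_shard_gb))

    # Every replica is a full copy of every primary shard.
    total_shards = primaries * (1 + replicas)
    required_gb = data_gb * (1 + replicas)
    usable_gb = nodes * usable_per_node_gb

    return {
        "primaries": primaries,
        "total_shards": total_shards,
        "shards_per_node": total_shards / nodes,
        "required_gb": required_gb,
        "usable_gb": usable_gb,
        "fits": required_gb <= usable_gb,
    }

# Numbers from this thread, with one replica and with none.
with_replica = plan_shards(data_gb=50, nodes=9, disk_per_node_gb=10, replicas=1)
no_replica = plan_shards(data_gb=50, nodes=9, disk_per_node_gb=10, replicas=0)
print(with_replica)
print(no_replica)
```

Under these assumptions, 50GB with one replica needs about 100GB of disk, which is more than the 90GB the cluster has in total, so disk space is the limiting factor here rather than heap. The total-disk check is only a necessary condition; how shards actually pack onto individual nodes can still matter.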


#7

But in my case I have a data set of 50GB. Can I please know how many shards to allocate per node, and why?

Thank you