Is my elasticsearch index data duplicated to all nodes?


#1

Hi,

I am using ELK GA 6.3.0. I have set up a 5-node cluster in which all nodes are master-eligible. I got the data below from a REST request:

health status index   uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   myindex UmygQyrBRnqAmImsdT1wOA   5   1      10059            0       17mb          8.5mb

My question is: I noticed pri = 5. Is the same data being duplicated to all 5 nodes? What I want is this: if I have 100 entries in my index, every node should hold about 20 + 20 = 40 entries (20 primary + 20 replica). How do I do this?

Thanks.


(Mark Walkom) #2

pri = primary shards.
rep = replica copies.

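So you have 5 primary shards and 1 replica of each, i.e. 5 replica shards, giving you 10 shards in total for that index. Each document lives in exactly one primary shard, so with rep = 1 it is stored twice in the cluster, not once per node.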
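
The shard arithmetic can be checked with a short Python sketch. The function names are made up for illustration; the inputs are the pri, rep and pri.store.size values from the output in the first post.

```python
def total_shards(primaries: int, replicas: int) -> int:
    """Each primary has `replicas` copies, so the index has primaries * (1 + replicas) shards."""
    return primaries * (1 + replicas)


def total_store_mb(pri_store_mb: float, replicas: int) -> float:
    """Approximate size across the cluster: the primary data plus one full copy per replica."""
    return pri_store_mb * (1 + replicas)


print(total_shards(5, 1))      # 10 shards: 5 primaries + 5 replicas
print(total_store_mb(8.5, 1))  # 17.0, which matches store.size in the table
```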

If you want all nodes to have a copy of the data then you will want 4 replicas.
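
For reference, the replica count is a dynamic index setting that can be changed through the update index settings API (PUT myindex/_settings). Below is a minimal Python sketch that only builds the request; the helper name and the http://localhost:9200 address are assumptions, so point it at one of your own nodes.

```python
import json
import urllib.request


def build_replica_update(base_url: str, index: str, replicas: int) -> urllib.request.Request:
    """Build (but do not send) a PUT <index>/_settings request that sets number_of_replicas."""
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Assumed address; replace with one of your own nodes.
request = build_replica_update("http://localhost:9200", "myindex", 4)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To actually apply the change: urllib.request.urlopen(request)
```

Keep in mind that with 4 replicas on 5 nodes every node holds a full copy of the index, so disk usage is roughly five times the primary size.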


#3

Hi @warkolm, so currently, if my index has 100 entries, does each node hold 20 + 20 = 40 entries (20 primary + 20 replica)? I am asking because I am concerned about disk space. My expectation for a single entry is 1 primary copy on one node + 1 replica on another node.


(Mark Walkom) #4

The sharding algorithm is not that precise, so you may not end up with those exact numbers. Overall though it will balance out.
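
To illustrate why the split is only roughly even: a document is routed to a shard by hashing its routing value (the document id by default) and taking the result modulo the number of primary shards. The Python sketch below simulates that idea with 100 made-up ids. Note that zlib.crc32 is only a stand-in and is not the hash Elasticsearch uses, so the exact counts will differ from a real cluster.

```python
import zlib
from collections import Counter

NUM_PRIMARY_SHARDS = 5  # "pri" from the output in the first post


def pick_shard(doc_id: str, num_shards: int = NUM_PRIMARY_SHARDS) -> int:
    """Stand-in for routing: hash the id, then take it modulo the shard count.

    zlib.crc32 is NOT the hash Elasticsearch uses; it is here only to show the idea.
    """
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


# 100 made-up document ids, counted per shard.
counts = Counter(pick_shard(f"doc-{i}") for i in range(100))
for shard in range(NUM_PRIMARY_SHARDS):
    print(f"shard {shard}: {counts[shard]} docs")
```

The per-shard counts come out close to 20 each but usually not exactly 20, and the gap tends to matter less as the index grows.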


#5

Thanks @warkolm for your kind support :slightly_smiling_face: :+1: