Is my Elasticsearch index data duplicated to all nodes?

elasticheart · November 23, 2018, 9:06am

Hi,

I am using ELK GA 6.3.0. I have set up a 5-node cluster in which all nodes are master-eligible. A REST request returns the data below:

health status index   uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   myindex UmygQyrBRnqAmImsdT1wOA   5   1      10059            0       17mb          8.5mb

My question is: I noticed pri = 5. Is the same data being duplicated to all 5 nodes? What I want is this: if I have 100 entries in my index, each node should hold 20 + 20 = 40 entries (20 primary + 20 replica). How do I do this?

Thanks.

warkolm · November 23, 2018, 9:10am

pri = primary shards.
rep = replica copies.

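So you have 5 primary shards and 1 replica copy of each, i.e. 5 replica shards, giving you 10 shards in total for that index. Each document is stored in exactly one primary shard and in that shard's replica on a different node, so the data is held twice in total, not once per node.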
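
To make the arithmetic concrete, here is a small Python sketch. The function names are made up for illustration and are not part of any Elasticsearch client. It assumes replicas are about the same size as their primaries, which matches the store.size (17mb) and pri.store.size (8.5mb) figures in the output above.

```python
def total_shards(primaries: int, replicas: int) -> int:
    """Shards for one index: each primary plus `replicas` copies of it."""
    return primaries * (1 + replicas)


def total_store_mb(pri_store_mb: float, replicas: int) -> float:
    """Approximate on-disk size, assuming each replica is as large as its primary."""
    return pri_store_mb * (1 + replicas)


# Values taken from the index shown above: pri = 5, rep = 1, pri.store.size = 8.5mb
print(total_shards(5, 1))       # 10
print(total_store_mb(8.5, 1))   # 17.0
```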

If you want every node to hold a full copy of the data, then you will want 4 replicas.
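
For reference, the replica count can be changed on an existing index through the update index settings API (PUT /<index>/_settings). Below is a minimal Python sketch that only builds the request and does not send it. The host http://localhost:9200 and the helper name build_replica_update are assumptions for illustration, so adjust them for your cluster. Bear in mind that 4 replicas means five copies of every shard, so roughly five times the primary size on disk.

```python
import json
import urllib.request


def build_replica_update(index: str, replicas: int,
                         base_url: str = "http://localhost:9200") -> urllib.request.Request:
    """Build (but do not send) a PUT /<index>/_settings request.

    base_url is an assumed address; replace it with your own node.
    """
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


req = build_replica_update("myindex", 4)
print(req.get_method(), req.full_url)
# To actually apply it against a running cluster:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```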

elasticheart · November 23, 2018, 9:15am

Hi @warkolm, so currently, if my index has 100 entries, does each node hold 20 + 20 = 40 entries (20 primary + 20 replica)? I ask because I am concerned about disk space. My expectation is that a single entry has 1 primary on one node and 1 replica on another node.

warkolm · November 23, 2018, 9:17am

The sharding algorithm is not that precise: documents are routed to a shard by a hash of their routing value (the document ID by default), so you may not end up with those exact numbers. Overall though it will balance out.
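
As a rough illustration of why the split is not exact, here is a Python sketch of hash-based routing. It is not Elasticsearch code: Elasticsearch uses a Murmur3 hash of the routing value, while this sketch swaps in MD5 from the standard library as a stand-in, and the document IDs are invented. The point is only that hash modulo shard count gives an approximately even spread, not exactly 20 per shard.

```python
import hashlib
from collections import Counter


def shard_for(doc_id: str, num_primaries: int) -> int:
    """Pick a primary shard from a hash of the ID (MD5 here as a stand-in hash)."""
    digest = hashlib.md5(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_primaries


def docs_per_shard(doc_ids, num_primaries: int) -> list:
    """Count how many of the given IDs land on each primary shard."""
    counts = Counter(shard_for(d, num_primaries) for d in doc_ids)
    return [counts.get(s, 0) for s in range(num_primaries)]


# 100 hypothetical document IDs spread over 5 primary shards
counts = docs_per_shard([f"doc-{i}" for i in range(100)], 5)
print(counts)
```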

elasticheart · November 23, 2018, 9:43am

Thanks @warkolm for your kind support.
