Different primary and replica shard sizes

I am using ES 7.1.1.
I have one shard in each index and two replicas.
I am using BulkProcessor to insert data. We never update or delete docs.
Cluster: 3 master, 3 data, 2 coordinating nodes

Problem:
The size of the data in the primary shard differs from that of the replicas, although the document counts are the same.
We have observed that when this situation occurs, the same query returns inconsistent results when we hit the coordinating nodes multiple times.

Query:
We index data with the replica count set to 2. Why does replication not create an exact copy of the primary? This is just an example from one index; we have about 100 such indexes, and the problem occurs on a couple of them every time.
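For context, the indexes are created with settings along these lines (a minimal sketch; the index name is illustrative):

```
PUT /index-1
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 2
  }
}
```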

Supporting Data:

/_cat/shards
index-1 0 p STARTED 55592 149.8mb es-data-01
index-1 0 r STARTED 55592 149.6mb es-data-02
index-1 0 r STARTED 55592 149.8mb es-data-03

/_cat/segments

index shard prirep segment generation docs.count docs.deleted size size.memory committed searchable version compound
index-1 0 p _0 0 6839 0 19mb 21763 TRUE TRUE 8.0.0 TRUE
index-1 0 p _1 1 7166 0 19.1mb 21974 TRUE TRUE 8.0.0 TRUE
index-1 0 p _2 2 7794 0 19.8mb 21871 TRUE TRUE 8.0.0 TRUE
index-1 0 p _3 3 5138 0 13.5mb 18886 TRUE TRUE 8.0.0 TRUE
index-1 0 p _4 4 4559 0 12mb 18434 TRUE TRUE 8.0.0 TRUE
index-1 0 p _5 5 3940 0 11.2mb 17865 TRUE TRUE 8.0.0 TRUE
index-1 0 p _6 6 4447 0 12.6mb 18595 TRUE TRUE 8.0.0 TRUE
index-1 0 p _7 7 5687 0 15.7mb 20070 TRUE TRUE 8.0.0 TRUE
index-1 0 p _8 8 2387 0 6.2mb 16028 TRUE TRUE 8.0.0 TRUE
index-1 0 p _9 9 4396 0 11.8mb 18330 TRUE TRUE 8.0.0 TRUE
index-1 0 p _a 10 2907 0 7.6mb 16630 TRUE TRUE 8.0.0 TRUE
index-1 0 p _b 11 332 0 954kb 17472 TRUE TRUE 8.0.0 TRUE
index-1 0 r _0 0 9730 0 26.2mb 24541 TRUE TRUE 8.0.0 TRUE
index-1 0 r _1 1 6982 0 18.1mb 20708 TRUE TRUE 8.0.0 TRUE
index-1 0 r _2 2 11768 0 31.4mb 25719 TRUE TRUE 8.0.0 TRUE
index-1 0 r _3 3 5680 0 15.9mb 19605 TRUE TRUE 8.0.0 TRUE
index-1 0 r _4 4 5193 0 14.3mb 19072 TRUE TRUE 8.0.0 TRUE
index-1 0 r _5 5 3325 0 8.9mb 17078 TRUE TRUE 8.0.0 TRUE
index-1 0 r _6 6 3015 0 8mb 16779 TRUE TRUE 8.0.0 TRUE
index-1 0 r _7 7 1576 0 4.4mb 15220 TRUE TRUE 8.0.0 TRUE
index-1 0 r _8 8 3305 0 8.7mb 17065 TRUE TRUE 8.0.0 TRUE
index-1 0 r _9 9 1893 0 4.9mb 15619 TRUE TRUE 8.0.0 TRUE
index-1 0 r _a 10 1793 0 4.6mb 15476 TRUE TRUE 8.0.0 TRUE
index-1 0 r _b 11 1332 0 3.6mb 15018 TRUE TRUE 8.0.0 TRUE
index-1 0 r _0 0 6839 0 19mb 21763 TRUE TRUE 8.0.0 TRUE
index-1 0 r _1 1 7166 0 19.1mb 21974 TRUE TRUE 8.0.0 TRUE
index-1 0 r _2 2 7794 0 19.8mb 21871 TRUE TRUE 8.0.0 TRUE
index-1 0 r _3 3 5138 0 13.5mb 18886 TRUE TRUE 8.0.0 TRUE
index-1 0 r _4 4 4559 0 12mb 18434 TRUE TRUE 8.0.0 TRUE
index-1 0 r _5 5 3940 0 11.2mb 17865 TRUE TRUE 8.0.0 TRUE
index-1 0 r _6 6 4447 0 12.6mb 18595 TRUE TRUE 8.0.0 TRUE
index-1 0 r _7 7 5687 0 15.7mb 20070 TRUE TRUE 8.0.0 TRUE
index-1 0 r _8 8 2387 0 6.2mb 16028 TRUE TRUE 8.0.0 TRUE
index-1 0 r _9 9 4396 0 11.8mb 18330 TRUE TRUE 8.0.0 TRUE
index-1 0 r _a 10 2907 0 7.6mb 16630 TRUE TRUE 8.0.0 TRUE
index-1 0 r _b 11 332 0 954kb 17472 TRUE TRUE 8.0.0 TRUE

Any updates on this? This is a major problem for us.

This is completely normal behaviour. Primaries and replicas contain the same documents but do not necessarily index them into exactly the same segments. Why is it a major problem?

Thanks for your reply David.

I had highlighted this as a major problem because:
We connect to the coordinating nodes with our client. When we fire a query, the coordinating node sends it to each data node in round-robin fashion, and our application receives inconsistent results.
Hit 1: Match
Hit 2: No Match
Hit 3: Match
Hit 4: No match
and on and on...

Above could be in any sequence :frowning:

Having different segments is completely OK. But the results returned each time the query hits ES should be the same. It is more or less as if the primary is not in sync with its replicas.

That sounds strange. What is the query?

In the meantime, we figured out something.
We are using a terms_set query on a field that has a synonym filter.
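The query is shaped roughly like this (a sketch; the field name, terms, and minimum_should_match_field are illustrative, not our real mapping):

```
GET /index-1/_search
{
  "query": {
    "terms_set": {
      "my_synonym_field": {
        "terms": ["term-a", "term-b"],
        "minimum_should_match_field": "required_matches"
      }
    }
  }
}
```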

We noticed that when we query one node using the preference parameter, we always get a result, whereas when we query another node the same way, we never get a result.
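This is how we pinned the search to a particular shard copy (a sketch; the node names are illustrative):

```
# Always hits the shard copy on the named node
GET /index-1/_search?preference=_only_nodes:es-data-01

# The same query against a different node's copy returns different results
GET /index-1/_search?preference=_only_nodes:es-data-02
```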

Our synonym file changed a while ago, but we did not reindex.
In the past we had noticed that if we restarted ES after updating the synonym file, the changes were reflected; we did not have to reindex.

Is not reindexing after the synonym change causing the inconsistent results?

A good thought. Yes, inconsistent synonyms files can indeed result in inconsistent search results. Elasticsearch will use the current synonym file for searches and for indexing new documents but will not re-index any old documents, and this can leave your index in a very strange state. I think reindexing would be wise.
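For anyone hitting the same issue, reindexing into a fresh index can be done with the Reindex API (a sketch; the destination name is illustrative — the new index picks up the current synonym file at index time):

```
POST /_reindex
{
  "source": { "index": "index-1" },
  "dest":   { "index": "index-1-v2" }
}
```

After the reindex completes, point your application (or an index alias) at the new index.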

Re-indexing has solved the synonym-related issue. I will keep a watch on this though.
Appreciate your response time. Thanks :slight_smile:


This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.