Replica Shards Much Larger Than Primaries

I see two (possibly related) things in this excerpt from the shard-level stats. The inflated shards have significantly higher flush counts, and a local checkpoint that lags the maximum sequence number by a long way. The lagging local checkpoint means Elasticsearch keeps hold of a lot of translog, and indeed the translog.uncommitted_size_in_bytes statistics correlate well with the difference between the sizes of primary and replica. The total size of the translog is much larger on the problematic shards too, on both primary and replica, because the lagging local checkpoint prevents the global checkpoint from advancing, which in turn prevents old parts of the translog from being cleaned up. I think this goes some way to explaining the imbalance in sizes between the different primaries as well, particularly since their document counts are so close.
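
If you want to check other indices for the same symptom, a jq query along these lines over the same kind of shard-level stats dump should surface it. The stats fields (seq_no.max_seq_no, seq_no.local_checkpoint, translog.uncommitted_size_in_bytes and flush.total) are the ones discussed above; output keys like checkpoint_lag are just labels chosen here for illustration:

$ cat 20180407_index_shards_stats.json | jq '.indices["applicationlogs-2018.04.07"].shards
    | map_values(map({primary: .routing.primary,
                      max_seq_no: .seq_no.max_seq_no,
                      local_checkpoint: .seq_no.local_checkpoint,
                      checkpoint_lag: (.seq_no.max_seq_no - .seq_no.local_checkpoint),
                      uncommitted_translog_bytes: .translog.uncommitted_size_in_bytes,
                      flushes: .flush.total}))'

On a healthy shard copy the local checkpoint tracks the maximum sequence number closely, so a persistently large checkpoint_lag is the thing to look for.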

It is possible that this is fixed by #29125: the high flush counts point in that direction, but I'm not sure that explains the lagging local checkpoint.

I am not going to be available to look at this in more depth for the next week or so, but I will raise it with the team. In the meantime, it looks like rebuilding the replicas should get everything back on track.
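
One way to rebuild the replicas, assuming Elasticsearch is reachable on localhost:9200 and you can tolerate this index briefly running without replicas, is to drop number_of_replicas to 0 and then put it back:

$ curl -XPUT 'localhost:9200/applicationlogs-2018.04.07/_settings' -H 'Content-Type: application/json' -d '{"index.number_of_replicas": 0}'
$ curl -XPUT 'localhost:9200/applicationlogs-2018.04.07/_settings' -H 'Content-Type: application/json' -d '{"index.number_of_replicas": 1}'

The first call discards the existing (bloated) replica copies, and the second causes fresh replicas to be built by copying the current primaries.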

The excerpt from the stats was obtained as follows:

$ cat 20180407_index_shards_stats.json | jq '.indices["applicationlogs-2018.04.07"].shards | map_values(map({primary:.routing.primary,translog,seq_no,flush}))'
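
For reference, if you want to capture an equivalent dump from a live cluster, something like this should produce it (again assuming the default host and port, and adjusting the index name as needed):

$ curl -s 'localhost:9200/applicationlogs-2018.04.07/_stats?level=shards' > 20180407_index_shards_stats.json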