applicationlogs-2018.04.04 2 p STARTED 21977491 29.5gb 10.187.13.6 elasticsearch-data-hot-3
applicationlogs-2018.04.04 2 r STARTED 21985312 55.3gb 10.187.8.6 elasticsearch-data-hot-4
When this happens, the replicas continue to grow at a faster rate than their primaries, which results in bloated disks.
Link to the stats for an example index: https://gist.github.com/Evesy/fab61e87cd277c56e8d5bfa62a3b3feb
The doc counts look about right; I'm not sure whether anything else in there points towards the issue. This is happening on more than one index. All the indices are ingest-only (this example at roughly 2k docs/sec) and nothing is ever deleted.
When we rebuild the replicas they do come back at roughly the same size as the primary shards. However, we have only rebuilt them after the index has stopped ingesting, and a force merge is performed shortly afterwards, so it's difficult to say for sure whether the issue would have come back.
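For reference, the force merge we run once ingest has stopped is along these lines (the index name, host and segment count here are just illustrative):

  curl -X POST 'localhost:9200/applicationlogs-2018.04.04/_forcemerge?max_num_segments=1'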
I'll try to get the shard stats output for you when I'm back online. However, since previous indices have been merged down to 1 segment per shard, I'm guessing the output from those will be unhelpful? If so, I'll gather stats from a newer index that's currently experiencing the issue.
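Checking which indices have already been merged down is easy enough with something like the cat segments API, which lists every segment per shard (host and index name illustrative):

  curl 'localhost:9200/_cat/segments/applicationlogs-2018.04.04?v'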
Yes, if you've cleaned up the problem on that index and the issue recurs on newer indices then please grab the stats from one exhibiting the problem instead.
I've collected some new stats for an index that's showing the problem:
applicationlogs-2018.04.07 5 r STARTED 20439597 14.7gb 10.187.9.25 elasticsearch-data-hot-1
applicationlogs-2018.04.07 5 p STARTED 20441468 14.5gb 10.187.21.7 elasticsearch-data-hot-5
applicationlogs-2018.04.07 2 p STARTED 20442814 31.1gb 10.187.9.25 elasticsearch-data-hot-1
applicationlogs-2018.04.07 2 r STARTED 20444535 64.5gb 10.187.11.27 elasticsearch-data-hot-2
applicationlogs-2018.04.07 4 p STARTED 20437839 14.5gb 10.187.13.19 elasticsearch-data-hot-3
applicationlogs-2018.04.07 4 r STARTED 20441808 14.2gb 10.187.21.7 elasticsearch-data-hot-5
applicationlogs-2018.04.07 3 p STARTED 20436107 24.5gb 10.187.11.27 elasticsearch-data-hot-2
applicationlogs-2018.04.07 3 r STARTED 20440543 57.3gb 10.187.13.19 elasticsearch-data-hot-3
applicationlogs-2018.04.07 1 p STARTED 20440523 26gb 10.187.10.25 elasticsearch-data-hot-0
applicationlogs-2018.04.07 1 r STARTED 20444270 60.5gb 10.187.8.21 elasticsearch-data-hot-4
applicationlogs-2018.04.07 0 r STARTED 20433938 14.6gb 10.187.10.25 elasticsearch-data-hot-0
applicationlogs-2018.04.07 0 p STARTED 20433435 13.6gb 10.187.8.21 elasticsearch-data-hot-4
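For reference, that listing is cat shards output, produced with something like the command below (host is illustrative); the columns are index, shard, prirep, state, docs, store, ip and node:

  curl 'localhost:9200/_cat/shards/applicationlogs-2018.04.07?v'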
I see two (possibly related) things in this excerpt from the shard-level stats. The inflated shards have significantly higher flush counts, and a local checkpoint that lags the maximum sequence number by a long way. The lagging local checkpoint means Elasticsearch keeps hold of a lot of translog, and indeed the translog.uncommitted_size_in_bytes statistics correlate well with the difference between the sizes of primary and replica. The total size of the translog is much larger on the problematic shards too, on both primary and replica, because the lagging local checkpoint prevents the global checkpoint from advancing, which in turn prevents old parts of the translog from being cleaned up. I think this goes some way towards explaining the imbalance in sizes between the different primaries, particularly since their document counts are so close.
It is possible that this is fixed by #29125: the high flush counts point in that direction, but I'm not sure that explains the lagging local checkpoint.
I am not going to be available to look at this in more depth for the next week or so, but I will raise it with the team. In the meantime, it looks like rebuilding the replicas should get everything back on track.
The excerpt from the stats was obtained as follows:
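(A sketch along those lines, assuming curl and jq against the shard-level stats API; the host and the exact filter here are illustrative:)

  curl -s 'localhost:9200/applicationlogs-2018.04.07/_stats?level=shards' |
    jq '.indices[].shards | to_entries[] | {shard: .key, copies: [.value[] |
          {node: .routing.node, primary: .routing.primary,
           flush_total: .flush.total,
           max_seq_no: .seq_no.max_seq_no,
           local_checkpoint: .seq_no.local_checkpoint,
           global_checkpoint: .seq_no.global_checkpoint,
           translog_size_in_bytes: .translog.size_in_bytes,
           translog_uncommitted_size_in_bytes: .translog.uncommitted_size_in_bytes}]}'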
Interestingly, it was because of #28350 that we upgraded from 6.1.2 in the first place, as we believed we were suffering from the symptoms described in another post somewhere on this forum.
I can confirm that rebuilding the replicas is a suitable workaround for the time being.
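For anyone else hitting this, the rebuild is just a matter of dropping the replicas and adding them back once they've been removed, something like the following (index name and host are illustrative):

  # drop the replicas
  curl -X PUT 'localhost:9200/applicationlogs-2018.04.07/_settings' -H 'Content-Type: application/json' -d '{"index": {"number_of_replicas": 0}}'

  # once they have been removed, add them back
  curl -X PUT 'localhost:9200/applicationlogs-2018.04.07/_settings' -H 'Content-Type: application/json' -d '{"index": {"number_of_replicas": 1}}'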