We are seeing a situation where, when we restart ES, shard recovery takes a long time. The service is usually down for only about 90 seconds, but recovery can take an hour or more. During my investigation I have been looking at the [index]/_stats?level=shards API, and I see this:
I just want to understand what this is telling me so I can figure out if it is normal or not.
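In case it helps, this is roughly how I am pulling the per-shard translog numbers out of that response (a minimal Python sketch; the host URL and the index name "my-index" are placeholders for our actual cluster):

```python
# Sketch: dump per-shard translog stats from the _stats?level=shards response.
# "http://localhost:9200" and "my-index" are placeholders.
import requests

ES = "http://localhost:9200"
INDEX = "my-index"

resp = requests.get(f"{ES}/{INDEX}/_stats", params={"level": "shards"})
resp.raise_for_status()
stats = resp.json()

# With level=shards, stats are broken out per shard copy under indices.<name>.shards.
for shard_id, copies in stats["indices"][INDEX]["shards"].items():
    for copy in copies:
        tl = copy["translog"]
        print(
            f"shard {shard_id} primary={copy['routing']['primary']} "
            f"uncommitted_ops={tl['uncommitted_operations']} "
            f"uncommitted_bytes={tl['uncommitted_size_in_bytes']}"
        )
```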
We are using the defaults for the translog (v6.8.8), so I would think the translog is getting flushed with each successful operation. Looking at the stats above, it is telling me there is ~21 GB of uncommitted data. If it is getting flushed continuously, shouldn't this be much lower?
The index has 80 primary shards (with 2 replicas). Each shard is ~35-45 GB, for a total index size of 2.9 TB and 5.3 billion docs.
The function of the translog is to ensure durability: operations that have not yet been committed to disk can be replayed from it if there is a problem. A flush operation commits that data and clears the translog.
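For example (a sketch; the host and index name are placeholders), you can trigger a flush yourself and watch the uncommitted counters drop back toward zero:

```python
# Sketch: force a flush, then re-check the translog stats for the index.
# "http://localhost:9200" and "my-index" are placeholders.
import requests

ES = "http://localhost:9200"
INDEX = "my-index"

# Flush commits the in-memory data to Lucene segments on disk and clears the translog.
requests.post(f"{ES}/{INDEX}/_flush").raise_for_status()

# uncommitted_operations / uncommitted_size_in_bytes should now be near zero.
stats = requests.get(
    f"{ES}/{INDEX}/_stats/translog", params={"level": "shards"}
).json()
for shard_id, copies in stats["indices"][INDEX]["shards"].items():
    for copy in copies:
        tl = copy["translog"]
        print(shard_id, tl["uncommitted_operations"], tl["uncommitted_size_in_bytes"])
```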
I have a rough idea of what the translog is and why it is used. My question is about interpreting the response from the _stats API. Since we are not setting anything for index.translog.durability, I am assuming we are on the default of "request":
request (default): fsync and commit after every request. In the event of hardware failure, all acknowledged writes will already have been committed to disk.
Should I therefore expect uncommitted_operations to be fairly low (as each operation should get flushed on completion)? I am just not certain whether a constant ~20 GB of uncommitted_size_in_bytes is normal or a sign of an issue.
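To double-check that assumption, I pulled the effective translog settings for the index, including the defaults we never set explicitly (a sketch; the host and index name are placeholders, and filter_path just trims the response down to the translog keys):

```python
# Sketch: show the effective index.translog.* settings, including defaults.
# "http://localhost:9200" and "my-index" are placeholders.
import json
import requests

ES = "http://localhost:9200"
INDEX = "my-index"

resp = requests.get(
    f"{ES}/{INDEX}/_settings",
    params={
        "include_defaults": "true",
        "filter_path": "*.settings.index.translog,*.defaults.index.translog",
    },
)
print(json.dumps(resp.json(), indent=2))
```

That should confirm whether durability really is "request" and show which flush_threshold_size we are running with.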
Again, my goal is to track down why reinitialization of a shard takes an hour when the node was only offline for 90 seconds. If it was incorrectly replaying an inflated translog, that could be the reason.
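To see whether translog replay is actually where the time goes, I am also planning to watch the recovery API during the next restart, since it breaks each shard recovery down by phase (again a sketch; host and index name are placeholders):

```python
# Sketch: break shard recoveries down by phase to see if translog replay dominates.
# "http://localhost:9200" and "my-index" are placeholders.
import requests

ES = "http://localhost:9200"
INDEX = "my-index"

recovery = requests.get(f"{ES}/{INDEX}/_recovery").json()

for shard in recovery.get(INDEX, {}).get("shards", []):
    tl = shard["translog"]
    idx = shard["index"]
    print(
        f"shard {shard['id']} stage={shard['stage']} "
        f"file_copy_ms={idx['total_time_in_millis']} "
        f"translog_ops={tl['recovered']}/{tl['total']} "
        f"translog_ms={tl['total_time_in_millis']}"
    )
```

If the translog ops and time are large compared to the file-copy phase, that would point at translog replay as the culprit.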