Question about Segment described in "Elasticsearch: The Definitive Guide"

The gray segment does not represent a previous commit, as was suggested earlier. It is not included in the commit point because no commit has been run since the segment was flushed to disk. In the previous diagram there are three segments which have been flushed to disk AND committed. As indexing continues, the in-memory buffer fills up with new documents, and at some point (through either an explicit or an implicit operation) the buffer is flushed to disk, creating a new segment (the gray segment). This segment is now searchable but is NOT yet committed, because it has not been fsynced to disk. If the node were to fail now, the files for the gray segment are not guaranteed to have been persisted and may not survive a restart. Once the commit operation is called, the gray segment is fsynced and a new commit point is created which includes it. So a commit point never includes uncommitted (that is, non-fsynced) segments.

On a refresh, the IndexReader used to read the segments is updated to include the new uncommitted segment (this is why the documentation refers to the segment as searchable but uncommitted). If the node were to fail and restart, that segment would be lost since it is uncommitted, but the operations it contained would be replayed from the translog as part of the startup sequence, so the data is not lost from Elasticsearch entirely.
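
To make the lifecycle concrete, here is a toy Python model of it. This is only a sketch and not the Lucene or Elasticsearch API: the `ToyShard` class and all of its method and field names are made up for illustration. It also simplifies things by letting `refresh()` both write the buffer out as a new segment and reopen the reader, and by treating the translog as always durable.

```python
from dataclasses import dataclass, field


@dataclass
class ToyShard:
    """Hypothetical toy model of a shard's write path (illustration only)."""
    buffer: list = field(default_factory=list)      # in-memory indexing buffer
    segments: dict = field(default_factory=dict)    # segment name -> docs, written but maybe not fsynced
    searchable: set = field(default_factory=set)    # segments visible to the current reader
    commit_point: set = field(default_factory=set)  # segments that are fsynced and committed
    translog: list = field(default_factory=list)    # operations since the last commit (assumed durable)
    next_id: int = 0

    def index(self, doc):
        # Every operation goes to the translog and to the in-memory buffer.
        self.translog.append(doc)
        self.buffer.append(doc)

    def refresh(self):
        # Write the buffer out as a new segment and open it for search.
        # No fsync happens here, so the commit point is unchanged.
        if not self.buffer:
            return
        name = f"seg_{self.next_id}"
        self.next_id += 1
        self.segments[name] = self.buffer
        self.buffer = []
        self.searchable.add(name)

    def commit(self):
        # "fsync" every segment, write a new commit point, clear the translog.
        self.refresh()
        self.commit_point = set(self.segments)
        self.translog = []

    def crash_and_restart(self):
        # Only committed segments survive; the rest is rebuilt from the translog.
        self.segments = {n: d for n, d in self.segments.items() if n in self.commit_point}
        self.searchable = set(self.commit_point)
        self.buffer = []
        replay, self.translog = self.translog, []
        for doc in replay:
            self.index(doc)
        self.refresh()

    def search(self):
        return sorted(doc for name in self.searchable for doc in self.segments[name])


shard = ToyShard()
for doc in ("a", "b", "c"):
    shard.index(doc)
shard.commit()          # committed segment, listed in the commit point

shard.index("d")
shard.refresh()         # the "gray" segment: searchable but uncommitted
gray = shard.searchable - shard.commit_point

shard.crash_and_restart()
after_restart = shard.search()   # "d" comes back via translog replay
```

In this model the gray segment itself (`seg_1`) is gone after the restart, but document `d` is searchable again because it was replayed from the translog into a fresh segment.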
