GlobalCheckpoint syncer only syncs translog when the translog durability is REQUEST

Elasticsearch version: 7.10
Currently, IndexShard.maybeSyncGlobalCheckpoint is called in two places:

  1. In the AsyncGlobalCheckpointTask of IndexService;
  2. In the TransportReplicationAction after a write action.

In IndexShard.maybeSyncGlobalCheckpoint, it runs the globalCheckpointSyncer according to the following conditions:

        // only sync if there are no operations in flight, or when using async durability
        final SeqNoStats stats = getEngine().getSeqNoStats(replicationTracker.getGlobalCheckpoint());
        final boolean asyncDurability = indexSettings().getTranslogDurability() == Translog.Durability.ASYNC;
        if (stats.getMaxSeqNo() == stats.getGlobalCheckpoint() || asyncDurability) {
            final ObjectLongMap<String> globalCheckpoints = getInSyncGlobalCheckpoints();
            final long globalCheckpoint = replicationTracker.getGlobalCheckpoint();
            // async durability means that the local checkpoint might lag (as it is only advanced on fsync)
            // periodically ask for the newest local checkpoint by syncing the global checkpoint, so that ultimately the global
            // checkpoint can be synced. Also take into account that a shard might be pending sync, which means that it isn't
            // in the in-sync set just yet but might be blocked on waiting for its persisted local checkpoint to catch up to
            // the global checkpoint.
            final boolean syncNeeded =
                (asyncDurability && (stats.getGlobalCheckpoint() < stats.getMaxSeqNo() || replicationTracker.pendingInSync()))
                    // check if the persisted global checkpoint
                    || StreamSupport
                            .stream(globalCheckpoints.values().spliterator(), false)
                            .anyMatch(v -> v.value < globalCheckpoint);
            // only sync if index is not closed and there is a shard lagging the primary
            if (syncNeeded && indexSettings.getIndexMetadata().getState() == IndexMetadata.State.OPEN) {
                logger.trace("syncing global checkpoint for [{}]", reason);
                globalCheckpointSyncer.run();
            }
        }

One of the conditions checks whether the translog durability is ASYNC. If it is, and the global checkpoint is behind the maximum sequence number (or a shard copy is pending in-sync), it runs the GlobalCheckpointSyncer, which in turn executes GlobalCheckpointSyncAction. That action, however, syncs the translog of the given indexShard only when the translog durability is REQUEST:

    private void maybeSyncTranslog(final IndexShard indexShard) throws IOException {
        if (indexShard.getTranslogDurability() == Translog.Durability.REQUEST &&
            indexShard.getLastSyncedGlobalCheckpoint() < indexShard.getLastKnownGlobalCheckpoint()) {
            indexShard.sync();
        }
    }
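
To make the interplay of the two checks easier to follow, here is a minimal standalone sketch that restates both conditions as pure functions. It is not Elasticsearch code: the class CheckpointSyncGates, the enum Durability, and the method and parameter names are hypothetical stand-ins, and the sketch assumes that stats.getGlobalCheckpoint() and replicationTracker.getGlobalCheckpoint() return the same value.

```java
public class CheckpointSyncGates {
    // Hypothetical stand-in for Translog.Durability.
    public enum Durability { REQUEST, ASYNC }

    // Simplified restatement of the conditions quoted from IndexShard.maybeSyncGlobalCheckpoint.
    public static boolean shouldRunSyncer(Durability durability, long maxSeqNo, long globalCheckpoint,
                                          boolean pendingInSync, long[] inSyncGlobalCheckpoints,
                                          boolean indexOpen) {
        boolean asyncDurability = durability == Durability.ASYNC;
        // only consider syncing if there are no operations in flight, or when using async durability
        if (maxSeqNo != globalCheckpoint && !asyncDurability) {
            return false;
        }
        // does any in-sync copy know an older global checkpoint than the primary?
        boolean copyLagging = false;
        for (long known : inSyncGlobalCheckpoints) {
            if (known < globalCheckpoint) {
                copyLagging = true;
            }
        }
        boolean syncNeeded =
            (asyncDurability && (globalCheckpoint < maxSeqNo || pendingInSync)) || copyLagging;
        return syncNeeded && indexOpen;
    }

    // Simplified restatement of the condition in maybeSyncTranslog.
    public static boolean shouldFsyncTranslog(Durability durability, long lastSyncedGlobalCheckpoint,
                                              long lastKnownGlobalCheckpoint) {
        return durability == Durability.REQUEST
            && lastSyncedGlobalCheckpoint < lastKnownGlobalCheckpoint;
    }

    public static void main(String[] args) {
        long[] inSync = {7L, 7L};
        // ASYNC durability with operations beyond the global checkpoint
        System.out.println("run syncer:     "
            + shouldRunSyncer(Durability.ASYNC, 10L, 7L, false, inSync, true));
        System.out.println("fsync translog: "
            + shouldFsyncTranslog(Durability.ASYNC, 5L, 7L));
    }
}
```

With ASYNC durability and a global checkpoint behind the maximum sequence number, the first function returns true while the second returns false, which is exactly the combination this post is asking about.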

The condition in IndexShard.maybeSyncGlobalCheckpoint and the behavior of the action seem to conflict. Should we remove the translog durability check in GlobalCheckpointSyncAction?

I don't think so. We don't want to actually sync the translog here if the durability is ASYNC. In that case we just want to run a no-op GlobalCheckpointSyncAction to ensure that the primary's ReplicationTracker is kept up to date.


Indeed, as you say. Running a no-op GlobalCheckpointSyncAction lets the replicas learn the current globalCheckpoint from the primary and lets the primary collect the localCheckpoints from the replicas, which may advance the globalCheckpoint and update the checkpoints in ReplicationTracker.
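
As a rough illustration of why collecting local checkpoints can advance the global checkpoint, here is a toy model, assuming the global checkpoint is the minimum local checkpoint across the in-sync copies and never moves backwards. The class ToyReplicationTracker and its methods are hypothetical and far simpler than the real ReplicationTracker (no pending in-sync state, no persistence, no concurrency).

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class ToyReplicationTracker {
    // local checkpoint last reported by each in-sync copy, keyed by a made-up copy id
    private final Map<String, Long> localCheckpoints = new HashMap<>();
    private long globalCheckpoint;

    public ToyReplicationTracker(Map<String, Long> initialLocalCheckpoints) {
        if (initialLocalCheckpoints.isEmpty()) {
            throw new IllegalArgumentException("at least one in-sync copy is required");
        }
        localCheckpoints.putAll(initialLocalCheckpoints);
        globalCheckpoint = Collections.min(localCheckpoints.values());
    }

    // Records a local checkpoint reported by a copy; returns true if the global checkpoint advanced.
    public boolean updateLocalCheckpoint(String copyId, long localCheckpoint) {
        Long previous = localCheckpoints.get(copyId);
        if (previous == null) {
            throw new IllegalArgumentException("unknown copy: " + copyId);
        }
        // a stale report never lowers the recorded local checkpoint
        localCheckpoints.put(copyId, Math.max(previous, localCheckpoint));
        long candidate = Collections.min(localCheckpoints.values());
        if (candidate > globalCheckpoint) {
            globalCheckpoint = candidate;
            return true;
        }
        return false;
    }

    public long getGlobalCheckpoint() {
        return globalCheckpoint;
    }

    public static void main(String[] args) {
        Map<String, Long> initial = new HashMap<>();
        initial.put("primary", 10L);
        initial.put("replica", 7L);
        ToyReplicationTracker tracker = new ToyReplicationTracker(initial);
        System.out.println("before: " + tracker.getGlobalCheckpoint());
        // the replica reports that it has caught up, e.g. in response to a no-op sync
        boolean advanced = tracker.updateLocalCheckpoint("replica", 10L);
        System.out.println("advanced: " + advanced + ", after: " + tracker.getGlobalCheckpoint());
    }
}
```

In this model the global checkpoint stays at the slowest copy's local checkpoint until that copy reports progress, which is why a sync round that transfers no translog data can still be useful.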

