CCR - Soft deletes limits and prevention of data loss because of premature merging

Hi everybody!

After digging more into the new CCR functionality and its capability of storing a history of events for possible followers, I started to get curious about how high I could set the soft_delete setting on an index.
For my understanding, soft_deletes are gone after a merge happened. A merge can occur quite quickly if there is more load on an index, which results in data loss (this happened to me multiple times while testing).
Coming to my question, is there a rule of thumb for calculating a reasonable retention rate? And is there a way to prevent an index from merging too early?

Thanks in advance!

Hi @kley,

more information on this can be found in the following docs: https://www.elastic.co/guide/en/elastic-stack-overview/6.7/ccr-overview.html#_leader_index_retaining_operations_for_replication

Starting in 6.7, when a follower connects to the leader, it will create a shard history retention lease on the leader (which contains a timestamp as well as the minimum sequence number of operations to be retained), which ensures that the leader keeps all (soft-deleted) operations around in order for the follower to be able to catch up. This retention lease is valid for 12 hours and is constantly renewed as the follower is catching up to the leader.
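If you want to see the leases for yourself, they should show up in the shard-level indices stats of the leader index (`GET /<leader_index>/_stats?level=shards`). Here is a minimal sketch of how one could pull them out of the parsed JSON response. The response shape is written from memory, and the sample values (index name `leader`, lease id `example-lease-id`, sequence number, timestamp) are made up for illustration, so please check it against what your cluster actually returns.

```python
from datetime import datetime, timezone

# Illustrative sample only: a trimmed-down stats response with made-up values.
SAMPLE_STATS = {
    "indices": {
        "leader": {
            "shards": {
                "0": [
                    {
                        "routing": {"primary": True},
                        "retention_leases": {
                            "leases": [
                                {
                                    "id": "example-lease-id",  # hypothetical id
                                    "retaining_seq_no": 42,
                                    "timestamp": 1554076800000,  # epoch millis
                                    "source": "ccr",
                                }
                            ]
                        },
                    }
                ]
            }
        }
    }
}


def list_retention_leases(stats, index):
    """Return (shard, lease_id, retaining_seq_no, last_renewed) per primary shard lease."""
    rows = []
    shards = stats.get("indices", {}).get(index, {}).get("shards", {})
    for shard_id, copies in sorted(shards.items(), key=lambda kv: int(kv[0])):
        for copy in copies:
            # Only look at the primary copy of each shard.
            if not copy.get("routing", {}).get("primary"):
                continue
            for lease in copy.get("retention_leases", {}).get("leases", []):
                renewed = datetime.fromtimestamp(
                    lease["timestamp"] / 1000, tz=timezone.utc
                )
                rows.append(
                    (int(shard_id), lease["id"], lease["retaining_seq_no"], renewed)
                )
    return rows


for row in list_retention_leases(SAMPLE_STATS, "leader"):
    print(row)
```

A lease whose timestamp keeps advancing is being renewed by the follower. One that has not been renewed for longer than the 12 hour validity period is a candidate for expiry.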

For my understanding, soft_deletes are gone after a merge happened

Soft-deletes are only purged during merging if they are not explicitly retained by shard history retention leases.

This means that starting in 6.7, no retention rate needs to be calculated. The system ensures by itself that the leader keeps enough history for the follower to catch up.

Thanks for the quick response!

Interesting, why can I still set a retention rate when it's not needed? :smiley:
Anyway, driving the question a bit further to hopefully track down my problem.
Lately I ran into an issue where my follower index never replicates anything. It just creates the index and does not fetch any data from the leader. It doesn't matter if I reduce the load. That's why I was thinking that a premature merge might have caused the problem. When I call the ccr-stats API for that index, it tells me that it's not able to fetch the license information. I explained that in more detail over here: CCR - Remove follow_stats and License fetching problem

The index.soft_deletes.retention.operations setting was the only way to retain history in 6.6. While the setting still exists, its purpose has been taken over by the much more powerful shard history retention leases. Note that the CCR docs in 6.7 have also been significantly revised, and mention of the above setting has been removed, as it turned out to be quite difficult to configure properly.
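For anyone still on 6.6 who has to pick a value by hand, a rough back-of-the-envelope sketch follows. The heuristic (indexing rate times the longest follower outage you want to survive, times a safety factor) is my own assumption and not something taken from the docs, and the function names and example numbers are made up for illustration. It only builds the settings body for creating the leader index and does not talk to a cluster.

```python
import json
import math


def estimate_retention_operations(ops_per_second, max_follower_lag_seconds,
                                  safety_factor=2.0):
    """Heuristic (an assumption, not an official formula): retain enough
    operations to cover the longest expected follower outage, with headroom."""
    return math.ceil(ops_per_second * max_follower_lag_seconds * safety_factor)


def leader_index_settings(retention_operations):
    """Build a create-index request body for a leader index with soft deletes."""
    return {
        "settings": {
            "index.soft_deletes.enabled": True,
            "index.soft_deletes.retention.operations": retention_operations,
        }
    }


# Example numbers are hypothetical: 500 ops/s and a one hour follower outage.
ops = estimate_retention_operations(ops_per_second=500,
                                    max_follower_lag_seconds=3600)
print(json.dumps(leader_index_settings(ops), indent=2))
```

The weak point of any such estimate is that the indexing rate is rarely constant, which is a big part of why the lease-based approach in 6.7 is preferable.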
