Primary term and sequence number vs. _version meta field

Suppose my index has three primary shards. Indexing and update requests go to the primary shards, and with more than one primary shard Elasticsearch can distribute incoming requests (such as large bulk requests) across them to improve performance.

  1. Before 6.x, the "_version" field was used to keep track of how many times a document had been modified.

  2. I recently read about optimistic concurrency control using the primary term and sequence number (_primary_term and _seq_no), which were introduced in 6.x. My understanding is that they are considerably more reliable than _version, since they let us determine the exact order of the index operations that took place on the primary.

However, why is the _version parameter still in use rather than deprecated? Which of #1 or #2 would be more effective for handling conflicts in my use case?

While they can be used to track changes to a document, the primary term and sequence number exist mainly for internal operations around shard syncing and recovery. As such, you cannot set them on a document yourself; you can only read their current values and build logic on top of them, as you mention (for example, conditional writes with the if_seq_no and if_primary_term request parameters).
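
To illustrate the "read the values, then build logic on them" approach, here is a minimal sketch that only constructs the URL of a conditional index request. The if_seq_no and if_primary_term query parameters are part of the Elasticsearch index API; the helper name, host, index name, and document ID are made up for this example, and nothing is actually sent to a cluster.

```python
from urllib.parse import urlencode


def conditional_index_url(base_url, index, doc_id, seq_no, primary_term):
    """Build the URL for an index request that should only succeed if the
    document still has the given _seq_no and _primary_term.

    Hypothetical helper: it only assembles the URL, it does not send anything.
    """
    query = urlencode({"if_seq_no": seq_no, "if_primary_term": primary_term})
    return f"{base_url}/{index}/_doc/{doc_id}?{query}"


# Values as previously read from a GET or search response (assumed here).
url = conditional_index_url("http://localhost:9200", "products", "1", 5, 1)
print(url)
# http://localhost:9200/products/_doc/1?if_seq_no=5&if_primary_term=1
```

If the document was changed in the meantime, so that its current _seq_no or _primary_term no longer matches, Elasticsearch rejects the write with a 409 version conflict, and the client can re-read the document and retry.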

_version, however, is something you can control and set directly (for example, with external versioning), or leave for Elasticsearch to manage for you.
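
As a rough sketch of what "setting _version yourself" can look like, the snippet below builds the URL for an index request that uses external versioning. The version and version_type=external query parameters are part of the index API; the helper names, host, index name, and document ID are assumptions for illustration. The second function is only a simplified model of the rule for the external version type (the supplied version must be strictly greater than the stored one), not Elasticsearch code.

```python
from urllib.parse import urlencode


def external_version_url(base_url, index, doc_id, version):
    """Build the URL for an index request whose version number is supplied
    by the caller (for example, taken from a source database).

    Hypothetical helper: it only assembles the URL, it does not send anything.
    """
    query = urlencode({"version": version, "version_type": "external"})
    return f"{base_url}/{index}/_doc/{doc_id}?{query}"


def external_write_accepted(stored_version, incoming_version):
    """Simplified model of the check for version_type=external: the write is
    accepted only if the incoming version is strictly greater than the
    version currently stored for the document."""
    return incoming_version > stored_version


print(external_version_url("http://localhost:9200", "products", "1", 8))
# http://localhost:9200/products/_doc/1?version=8&version_type=external
print(external_write_accepted(stored_version=7, incoming_version=8))  # True
print(external_write_accepted(stored_version=8, incoming_version=8))  # False
```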

