External versioning (external or external_gte)

Hi folks,

I'm currently working on a project which has a "projection" storage implemented using Elasticsearch. I have a few questions about external versioning.

First of all, a general question. The documentation says:

> external_gte version type is meant for special use cases and should be used with care. If used incorrectly, it can result in loss of data.

Why exactly is that? Could you provide more details?

In my case, the primary storage holds a set of entities, and each of them has a version number. Two kinds of operations are possible:

  1. A user makes changes to a single entity, which internally increments the entity's version. The system then schedules an operation that builds a document from the entity data and indexes it in Elasticsearch.

  2. Someone from the support team wants to recreate/refresh the projection storage from the data kept in primary storage, using a bulk request. In this case the entity's version might not change.

So I thought of using external/external_gt for the first case and external_gte for the second. Another option is to use the external_gte version type for both scenarios, but I'm wondering whether there is a reason not to.
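To make the difference between the two version types concrete, here is a minimal sketch in Python. It does not talk to Elasticsearch; it only simulates the version check as I understand it from the documentation (external/external_gt accepts a write only if the supplied version is strictly greater than the stored one, external_gte also accepts an equal version). The `ProjectionIndex` class and its `index` method are hypothetical names made up for this illustration.

```python
from dataclasses import dataclass, field


class VersionConflict(Exception):
    """Raised when a write is rejected by the simulated version check."""


@dataclass
class ProjectionIndex:
    # Maps document id -> (version, source). In-memory stand-in for an index.
    docs: dict = field(default_factory=dict)

    def index(self, doc_id, source, version, version_type="external"):
        if version_type not in ("external", "external_gt", "external_gte"):
            raise ValueError(f"unsupported version_type: {version_type}")
        current = self.docs.get(doc_id)
        if current is not None:
            stored_version = current[0]
            if version_type == "external_gte":
                accepted = version >= stored_version  # equal version is allowed
            else:
                accepted = version > stored_version   # must be strictly greater
            if not accepted:
                raise VersionConflict(
                    f"{doc_id}: stored version {stored_version}, got {version}"
                )
        self.docs[doc_id] = (version, dict(source))


idx = ProjectionIndex()
idx.index("entity-1", {"name": "fresh"}, version=5)

# A write carrying the same version is rejected with "external" ...
try:
    idx.index("entity-1", {"name": "stale"}, version=5, version_type="external")
    rejected = False
except VersionConflict:
    rejected = True

# ... but accepted with "external_gte", replacing the stored content.
idx.index("entity-1", {"name": "stale"}, version=5, version_type="external_gte")
```

If this reading is right, the risk with external_gte would be exactly the last call: two writers holding different content for the same version number can overwrite each other, and the later request wins regardless of which content is correct.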

Do you have any recommendations for this case?

Thanks,
Piotr

It seems that this question was asked (but not answered) some time ago as well:

In our case we have a database where every entity is versioned. Can we use this entity version number with external_gte?

It seems that using external_gte would work for us, because we can easily "reindex" the entire database when we suspect it has gotten out of sync with the ES index.
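A full refresh like this could be sent as a bulk request in which every action carries the entity's version. The following is a rough sketch that only builds the newline-delimited JSON body for the `_bulk` endpoint and does not send it anywhere. The function name `build_bulk_body`, the index name `entities-projection`, and the shape of the entity records are assumptions made for illustration.

```python
import json


def build_bulk_body(entities, index_name, version_type="external_gte"):
    """Build an NDJSON body for the _bulk endpoint.

    Each entity is expected to be a dict with "id", "version" and "data"
    keys (a hypothetical shape chosen for this example).
    """
    lines = []
    for entity in entities:
        action = {
            "index": {
                "_index": index_name,
                "_id": str(entity["id"]),
                "version": entity["version"],    # version from primary storage
                "version_type": version_type,
            }
        }
        lines.append(json.dumps(action))          # action/metadata line
        lines.append(json.dumps(entity["data"]))  # document source line
    # The bulk body must be terminated by a newline.
    return "\n".join(lines) + "\n" if lines else ""


entities = [
    {"id": 1, "version": 7, "data": {"name": "first"}},
    {"id": 2, "version": 3, "data": {"name": "second"}},
]
body = build_bulk_body(entities, "entities-projection")
```

With external_gte, documents whose version in the index already equals the entity version would be rewritten rather than rejected, which is what makes the refresh repeatable.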

Do you think we are missing something in our understanding?

Mirek
