We've recently upgraded our stand-by cluster to ES 6.7.1 from ES 5.6. Since our upgrade we're seeing a situation where some updates to our documents are not being applied correctly.
Some additional details on our use case:
- We have a system that asynchronously updates ES documents in our cluster.
- We have two different modules that update documents. One is adding fields, the other is removing fields.
- When adding/updating fields we use the Update API with the doc_as_upsert flag set to true.
- When removing fields we do scripted updates via the Update API. We also set retry_on_conflict = 3 in this case.
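
A minimal sketch of the two request shapes described above, assuming the elasticsearch-py 6.x client, whose `update()` call takes `index`, `doc_type`, `id` and `body`. The helper names, the index/type/id values and the Painless script are hypothetical illustrations, not our actual code. The helpers only build keyword arguments, so they can be inspected without a running cluster.

```python
def build_upsert_kwargs(index, doc_type, doc_id, fields):
    """Keyword arguments for Elasticsearch.update() that add or overwrite fields.

    doc_as_upsert=True makes ES create the document from `fields`
    if it does not exist yet.
    """
    return {
        "index": index,
        "doc_type": doc_type,
        "id": doc_id,
        "body": {"doc": fields, "doc_as_upsert": True},
    }


def build_remove_fields_kwargs(index, doc_type, doc_id, field_names, retries=3):
    """Keyword arguments for a scripted update that removes top-level fields.

    The script is an assumed example; the field names are passed as params
    so the same script source is reused across requests.
    """
    return {
        "index": index,
        "doc_type": doc_type,
        "id": doc_id,
        "retry_on_conflict": retries,
        "body": {
            "script": {
                "lang": "painless",
                "source": "for (def f : params.fields) { ctx._source.remove(f); }",
                "params": {"fields": list(field_names)},
            }
        },
    }


# Usage against a real cluster (not executed here):
#   from elasticsearch import Elasticsearch
#   es = Elasticsearch()
#   es.update(**build_upsert_kwargs("my-index", "_doc", "42", {"color": "red"}))
#   es.update(**build_remove_fields_kwargs("my-index", "_doc", "42", ["size"]))
```

In this sketch only the removal request carries retry_on_conflict, matching the description above. An update sent without it can fail with a version conflict (HTTP 409) if another write to the same document lands between its read and write phases.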
One portion of our system kicks off both an update and a delete against the same documents. We are seeing cases where only one of the two operations is applied: sometimes only the delete succeeds, and sometimes only the update succeeds.
It's important to note that we're using the Elasticsearch Python library in the code that updates ES.
We didn't run into this issue in 5.6 as the retry_on_conflict setting seemed to ensure our operations completed. We're not seeing the same behavior in ES 6.7.1.
We understand that ES changed how document state is tracked and added optimistic concurrency control. We're concerned that document versioning and update conflict checking no longer work the way we came to expect in 5.6.
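
One way to check whether conflicts are being dropped is to retry on the client side and log what each attempt returns. The sketch below is a debugging aid under stated assumptions, not a fix: `call_with_conflict_retry` and the logger name are hypothetical, and it treats any exception exposing `status_code == 409` as a version conflict (elasticsearch-py's `TransportError` exposes `status_code`, and its `ConflictError` corresponds to 409). The `_seq_no` and `_primary_term` fields are read with `.get()` in case a response does not include them.

```python
import logging
import time

log = logging.getLogger("es_update_debug")  # hypothetical logger name


def call_with_conflict_retry(operation, max_retries=3, delay_seconds=0.0):
    """Call operation() and retry it when it fails with HTTP 409.

    `operation` is a zero-argument callable, e.g.
    lambda: es.update(index=..., doc_type=..., id=..., body=...).
    Any other error, or a conflict after max_retries retries, is re-raised.
    """
    attempt = 0
    while True:
        try:
            response = operation()
        except Exception as exc:
            is_conflict = getattr(exc, "status_code", None) == 409
            if not is_conflict or attempt >= max_retries:
                raise
            attempt += 1
            log.warning("version conflict, retry %d of %d", attempt, max_retries)
            time.sleep(delay_seconds)
            continue
        # Log the fields that show whether and how the write was applied.
        log.info(
            "result=%s version=%s seq_no=%s primary_term=%s",
            response.get("result"),
            response.get("_version"),
            response.get("_seq_no"),
            response.get("_primary_term"),
        )
        return response
```

Wrapping both the upsert and the scripted removal this way would show, per document, whether an operation hit a conflict, was retried, or came back as a no-op.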
We're currently doing further debugging and research. We're trying to enable TRACE logging to dig deeper into what's going on.
Is anyone else running into this? Any help you can provide is appreciated.
We'll post updates as we find more details.