How to index only the newest version of the document

I'm trying to figure out if what I'm trying to do is even possible, or if I've gone totally off track.

We have a series of timers we collect throughout a transaction lifecycle in a separate system, and we're trying to ensure we store them without losing any data, even if the transaction dies before completion. Towards this end we're looking at dropping an object onto a queue every time we update one of the timers. This way the objects will go off to the processor services and be indexed into Elasticsearch, and if the transaction dies we'll have the latest state of the timers recorded.

With that in mind, we need to avoid overwriting a newer version of the timers with an older one if the updates come off the queue, are processed, or are indexed in the wrong order (there are a lot of potential failure points in the chain where that could happen). So we were looking at the Update By Query functionality as a potential path to this, with one of the query conditions being that the timestamp in the incoming document is greater than the one in the latest indexed version. The issue we've encountered is that we can't seem to find a way to replace an entire document using this functionality. From what I can see, you need to script a key-by-key update, and since at least one of the fields being updated is an arbitrarily long array of dictionaries, that might be prohibitive.

Am I completely off the mark on how to do this? Am I missing something in the documentation that will allow me to do this simply? Any help that I can get would be most appreciated.

Hi,

As I understand your problem, you can solve it using document versioning.

More details are in the link below. Check the part titled "Already have a versioning system in place?", which describes how you can use the timestamp you already have, or something like it, as the version:

https://www.elastic.co/blog/elasticsearch-versioning-support
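As a rough sketch of what that could look like for the timers: the index API accepts `version` and `version_type=external` query parameters, and with those set Elasticsearch only accepts a write whose version is greater than the stored one, replacing the whole document in one request. The helper name, the `timers` index, the `txn-42` id, the local URL, and the use of a millisecond timestamp as the version are all assumptions for illustration, as is the `_doc` endpoint style (7.x or later). The function only builds the request; it does not send it.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_versioned_index_request(base_url, index, doc_id, doc, timestamp_ms):
    """Build (but do not send) a full-document index request.

    The document's own timestamp is passed as an external version, so
    Elasticsearch should reject the write if a newer version is already stored.
    Hypothetical helper, not part of any Elasticsearch client library.
    """
    params = urlencode({
        "version": int(timestamp_ms),   # must be a non-negative integer
        "version_type": "external",     # accept only strictly greater versions
    })
    url = "{}/{}/_doc/{}?{}".format(
        base_url.rstrip("/"),
        quote(index, safe=""),
        quote(str(doc_id), safe=""),
        params,
    )
    return Request(
        url,
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


if __name__ == "__main__":
    # Assumed example values; replace with your own index, id and timers.
    request = build_versioned_index_request(
        "http://localhost:9200",
        "timers",
        "txn-42",
        {"updated_at": 1700000000000, "timers": [{"name": "db", "ms": 12}]},
        1700000000000,
    )
    print(request.get_method(), request.full_url)
```

If you send the request with `urllib.request.urlopen`, a stale update should come back as an HTTP 409 version conflict (raised as `urllib.error.HTTPError`), which the processor can treat as "a newer state is already indexed" and drop. Two updates carrying the same timestamp would also conflict under `external`; `external_gte` relaxes that to greater-or-equal if you need it.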

Hope it helps.
