Update existing document

Frank_LaRosa · February 22, 2012, 12:45am

Hi,

I have some code that works as follows:

1. Query for a document using a non-analyzed field (so I should only
   get a result on an exact match).
2. If the query returned something, I update a counter in the document
   and re-index it (using prepareIndex with the id value from the hit).
3. If the query did not return anything, then I create a new document
   (using prepareIndex without an id).

My expectation is that this code should not produce any documents that
are duplicates (that have the same value in the non-analyzed field
used in the original search).

However, that's not the case. I often find several different versions
of each document.

How is this possible and what can I do to prevent it?

lukeforehand_2 · February 22, 2012, 3:15am

Are you indexing while searching, or do you have multiple clients
working at the same time that might cause an inconsistent state of
your index?

-Luke


Clinton_Gormley · February 22, 2012, 8:54am


Because all of this is happening in parallel, you may well get two
processes checking for the same (missing) value at the same time, and
both of them end up creating new docs.
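
The race can be reproduced without a cluster. The following is only a sketch: an in-memory map stands in for the index, and the class name, the method names and the "same-key" value are made up for illustration. A barrier forces both workers to finish their search before either one creates a document, which is the unlucky interleaving described above.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CyclicBarrier;

// Hypothetical stand-in for an index: maps an auto-generated id to the
// value of the non-analyzed field. This is NOT the Elasticsearch client.
final class CheckThenCreateRace {
    private final Map<String, String> docsById = new ConcurrentHashMap<>();

    // "Search": is there any document with this field value?
    boolean exists(String value) {
        return docsById.containsValue(value);
    }

    // "Index without an id": every call gets a fresh random id.
    void createWithAutoId(String value) {
        docsById.put(UUID.randomUUID().toString(), value);
    }

    int count(String value) {
        return (int) docsById.values().stream().filter(value::equals).count();
    }

    // Runs two workers that both search first and create afterwards.
    static int run() throws InterruptedException {
        CheckThenCreateRace index = new CheckThenCreateRace();
        CyclicBarrier bothHaveSearched = new CyclicBarrier(2);
        Runnable worker = () -> {
            boolean found = index.exists("same-key");
            try {
                // Wait until the other worker has searched as well.
                bothHaveSearched.await();
            } catch (Exception e) {
                throw new IllegalStateException(e);
            }
            if (!found) {
                index.createWithAutoId("same-key");
            }
        };
        Thread a = new Thread(worker);
        Thread b = new Thread(worker);
        a.start();
        b.start();
        a.join();
        b.join();
        return index.count("same-key");
    }

    public static void main(String[] args) throws InterruptedException {
        // Both searches came back empty, so two documents share one value.
        System.out.println("documents with same-key: " + run());
    }
}
```

In a real deployment the interleaving is not forced, so duplicates show up only some of the time, which matches the "I often find" in the original report.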

The only way to emulate a unique key in ES is by using the doc ID.
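
One way to apply this, sketched under assumptions: derive the doc ID deterministically from the value of the non-analyzed field, so every writer that sees the same value targets the same ID. The SHA-1 hashing, the class name and the method names below are illustrative choices, not something the thread prescribes, and the in-memory map is again only a stand-in for the index. With the real client, the derived ID would be passed to prepareIndex in the same place the ID from the hit is passed today.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: the unique field value decides the document id.
final class UniqueKeyById {
    // Stand-in for the index: id -> counter. NOT the Elasticsearch client.
    private final Map<String, Long> countersById = new ConcurrentHashMap<>();

    // Same input always yields the same 40-character hex id.
    static String docIdFor(String uniqueValue) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            byte[] digest = sha1.digest(uniqueValue.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Create the document with counter 1, or bump the existing counter.
    // The atomic increment comes from ConcurrentHashMap.merge; a real
    // index needs its own mechanism for that (an assumption not shown here).
    long indexOrIncrement(String uniqueValue) {
        return countersById.merge(docIdFor(uniqueValue), 1L, Long::sum);
    }

    int documentCount() {
        return countersById.size();
    }

    public static void main(String[] args) throws InterruptedException {
        UniqueKeyById index = new UniqueKeyById();
        Runnable writer = () -> index.indexOrIncrement("same-key");
        Thread a = new Thread(writer);
        Thread b = new Thread(writer);
        a.start();
        b.start();
        a.join();
        b.join();
        // Both writers hit the same id, so only one document exists.
        System.out.println("documents: " + index.documentCount());
    }
}
```

Because the ID no longer depends on a prior search, two concurrent writers can at worst overwrite each other's document; they cannot create a second one. Keeping the counter itself correct under concurrent updates is a separate concern that this sketch does not address for a real index.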

clint
