VersionConflictEngineException in 0.19.1 when updating via mapper attachments plugin

Shane_Witbeck · March 26, 2012, 8:21pm

I'm using a couple of workers to update existing documents with one or more
attachments (files). 99% of the time this works, but once in a while I get
the following exception:

https://gist.github.com/digitalsanctum/2209367

Caused by: org.elasticsearch.index.engine.VersionConflictEngineException: [threads][0] [thread][1311572]: version conflict, current [4], provided [3]
	at org.elasticsearch.index.engine.robin.RobinEngine.innerIndex(RobinEngine.java:525)
	at org.elasticsearch.index.engine.robin.RobinEngine.index(RobinEngine.java:479)
	at org.elasticsearch.index.shard.service.InternalIndexShard.index(InternalIndexShard.java:323)
	at org.elasticsearch.action.index.TransportIndexAction.shardOperationOnPrimary(TransportIndexAction.java:206)
	at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:529)
	at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.start(TransportShardReplicationOperationAction.java:431)
	at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.start(TransportShardReplicationOperationAction.java:338)
	at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction.doExecute(TransportShardReplicationOperationAction.java:104)
	at org.elasticsearch.action.index.TransportIndexAction.innerExecute(TransportIndexAction.java:135)

(Stack trace truncated; the full trace is in the gist linked above.)

The update code is here:

https://gist.github.com/digitalsanctum/2209383

Map<String, Object> attachment = new HashMap<String, Object>();
attachment.put("content", attachmentContent);
attachment.put("_indexed_chars", -1); // no limit

fileProperties.put("file", attachment);
fileProperties.put("postID", postID);
fileProperties.put("filename", file.getFilename());
fileProperties.put("path", file.getPath());
fileProperties.put("fileSize", file.getFileSize());

(Code truncated; the full snippet is in the gist linked above.)

Any ideas why the exception would occur?

kimchy · March 27, 2012, 1:22pm

Update works by reading the document from the shard and then indexing it
(using the version it was read with). A version conflict can happen if two
updates on the same doc run at the same time and interleave. You can set
retryOnConflict to a higher value to automatically retry the update when
that happens.
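
To make the read-then-index cycle concrete, here is a minimal sketch of the same idea in plain Java. It is a stdlib-only simulation, not Elasticsearch code: the class VersionedStore, its methods, and the exception type are all hypothetical names chosen for illustration, and the retry loop only approximates what a retryOnConflict setting would do.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.UnaryOperator;

// Hypothetical in-memory stand-in for a shard; none of these names are Elasticsearch APIs.
class VersionedStore {
    // A document body together with the version it was stored at.
    static final class Versioned {
        final long version;
        final String source;
        Versioned(long version, String source) { this.version = version; this.source = source; }
    }

    static final class VersionConflictException extends RuntimeException {
        VersionConflictException(String id, long current, long provided) {
            super("[" + id + "]: version conflict, current [" + current + "], provided [" + provided + "]");
        }
    }

    private final ConcurrentHashMap<String, Versioned> docs = new ConcurrentHashMap<>();

    Versioned get(String id) { return docs.get(id); }

    void create(String id, String source) { docs.put(id, new Versioned(1, source)); }

    // Index only if the stored version still equals the version the caller read.
    void index(String id, String source, long expectedVersion) {
        docs.compute(id, (key, current) -> {
            if (current == null || current.version != expectedVersion) {
                throw new VersionConflictException(id, current == null ? -1 : current.version, expectedVersion);
            }
            return new Versioned(expectedVersion + 1, source);
        });
    }

    // Read, apply the change, index with the version that was read.
    // On conflict, re-read and try again, at most retryOnConflict extra times.
    long update(String id, UnaryOperator<String> change, int retryOnConflict) {
        for (int attempt = 0; ; attempt++) {
            Versioned read = get(id);
            if (read == null) throw new IllegalArgumentException("no such document: " + id);
            try {
                index(id, change.apply(read.source), read.version);
                return read.version + 1;
            } catch (VersionConflictException e) {
                if (attempt >= retryOnConflict) throw e;
            }
        }
    }

    public static void main(String[] args) {
        VersionedStore store = new VersionedStore();
        store.create("1311572", "basic data");
        long version = store.update("1311572", s -> s + " + attachment", 3);
        System.out.println("new version: " + version);
    }
}
```

With retryOnConflict at 0, a second writer that sneaks in between the read and the index makes the update fail, which matches the "current [4], provided [3]" message in the trace. With a higher value, the update re-reads the newer version and applies the change on top of it.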

Shane_Witbeck · March 27, 2012, 1:39pm

That makes sense if I have two or more processes updating the same doc. Is
there a way to handle this more gracefully other than increasing the
retryOnConflict? Otherwise, I'll have to synchronize my processes to only
do one update at a time for a document.

Thanks,
Shane

kimchy · March 27, 2012, 2:30pm

What do you mean by handling it more gracefully? What do you have in mind?
The main idea here is not to block the indexing process while the update is
happening. You can have a high retry on conflict value...

Shane_Witbeck · March 27, 2012, 2:45pm

For background, my documents are currently indexed in two phases.

The first phase is fairly quick and indexes all basic data.
The second phase potentially runs longer, since it uses the
attachment-mapper plugin to update the document with possibly several
attachments ranging in size from a few KB to several MB.

The idea here is to get all documents indexed as quickly as possible with
basic data, then have several workers index the related attachments. While I
understand your design decision not to block the indexing process, I think
this should be mentioned in the documentation somewhere. I'll try your
suggestion of increasing retryOnConflict, but it seems better in my case
to manually synchronize the second phase to minimize indexing time and
resources. Would you agree this is the better approach?

Thanks,

Shane
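
One way the "manually synchronize the second phase" idea could look is to route every attachment task for a given document to the same single-threaded lane, so two updates to one document never overlap while different documents still run in parallel. The sketch below assumes the workers are threads inside one JVM (separate processes would need an external queue or lock instead). PerDocumentExecutor and its methods are hypothetical names, not part of any library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper: serializes work per document ID, parallelizes across IDs.
class PerDocumentExecutor {
    private final List<ExecutorService> lanes = new ArrayList<>();

    PerDocumentExecutor(int laneCount) {
        for (int i = 0; i < laneCount; i++) {
            // Daemon threads so a forgotten shutdown() does not keep the JVM alive.
            lanes.add(Executors.newSingleThreadExecutor(runnable -> {
                Thread thread = new Thread(runnable);
                thread.setDaemon(true);
                return thread;
            }));
        }
    }

    // The same document ID always maps to the same lane index.
    int laneFor(String docId) {
        return Math.floorMod(docId.hashCode(), lanes.size());
    }

    // Tasks for one document run one at a time, in submission order.
    Future<?> submit(String docId, Runnable update) {
        return lanes.get(laneFor(docId)).submit(update);
    }

    void shutdown() {
        for (ExecutorService lane : lanes) {
            lane.shutdown();
        }
    }
}
```

Each worker would call something like submit(postID, task) instead of running the update directly. The trade-off is that a document with many large attachments holds up the other documents hashed to its lane, so this avoids conflicts at the cost of some parallelism.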

kimchy · March 27, 2012, 6:50pm

I'm not really sure I understand. Are you indexing a doc and then adding
attachments to it using the update API? Why not index the whole doc with all
the attachments at once?

Shane_Witbeck · March 27, 2012, 7:00pm

Correct. I'm indexing a doc, then adding attachments via the update API. I'm
doing a two-phase approach to get the basic information indexed more
quickly instead of waiting for all associated attachments to get indexed up
front. This way I'm able to index over 1M documents with all data other
than attachments in ~20 mins instead of waiting for ~200K attachments to
get indexed, which takes much longer.

Anyway, I increased the retryOnConflict to 20 and the
VersionConflictEngineException went away.

Thanks,
Shane

