Retry_on_conflict when using partial update

chuckoh · March 16, 2018, 3:20am

PUT /intro_limits
{
  "settings": {
    "index": {
      "number_of_shards": 1
    },
    "auto_expand_replicas": "0-all"
  },
  "mappings": {
    "doc": {
      "properties": {
        "id": { "type": "long" },
        "list": { "type": "long" }
      }
    }
  }
}

I need to apply partial updates to documents in the index above, concurrently.

The list field holds an array of long values (up to several thousand of them), and new long values are appended to it frequently.

I observed that the partial updates occasionally fail with a 409 status code.

So I ended up setting retry_on_conflict=5, like this:

public function setIntroLimitAddParams(int $id, int $id_to_be_added)
{
    return [
        'id' => $id,
        'index' => 'intro_limits',
        'type' => 'doc',
        'retry_on_conflict' => 5,
        'body' => [
            'script' => [
                'source' =><<<EOT
if (ctx._source.list.contains(params.x)) {
    ctx.op = 'none';
} else {
    ctx._source.list.add(params.x);
}
EOT
                ,
                'lang' => 'painless',
                'params' => [
                    'x' => $id_to_be_added
                ]
            ],
            'upsert' => [
                'id' => $id,
                'list' => [ $id_to_be_added ]
            ]
        ]
    ];
}

Ever since I set the retry_on_conflict parameter, the 409 errors have disappeared completely. In my use case, however, data integrity is critical: no entry may go missing.

My question is: does letting Elasticsearch resolve conflicts through the retry_on_conflict parameter guarantee that I will never lose data?

While testing my application, I just saw a case where one value was NOT appended to the list, even though every response from Elasticsearch had a 200 status (no 409). What frustrates me is that I can't reproduce the case.
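
One thing I could check the next time this happens is the result field of each update response, since a 200 can also come back for a request that wrote nothing. A minimal sketch, assuming the decoded response array carries a result value such as created, updated, or noop (appendWasApplied is a hypothetical helper name, and the sample arrays are illustrative, not captured output):

```php
<?php
/**
 * Returns true only when the update response says the document was written.
 * Assumes $response is the decoded JSON body of an update request.
 */
function appendWasApplied(array $response): bool
{
    $result = $response['result'] ?? 'unknown';

    // 'noop' (or a missing field) means nothing was written for this request.
    return $result === 'created' || $result === 'updated';
}

// Illustrative response shapes only.
$written = ['_id' => '1', '_version' => 7, 'result' => 'updated'];
$skipped = ['_id' => '1', '_version' => 7, 'result' => 'noop'];

$writtenOk = appendWasApplied($written);
$skippedOk = appendWasApplied($skipped);
```

Logging every request where this returns false, together with the id being appended, should at least show whether the missing value was skipped as a no-op or never reached Elasticsearch at all.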

So I'm wondering whether relying on the retry_on_conflict parameter is appropriate for my use case, or whether I should instead handle the 409 error myself by resending the request at the application level until I get a 200.
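
For reference, the application-level alternative would look roughly like the sketch below. It is only an illustration under assumptions: VersionConflict is a hypothetical stand-in for whatever exception the client raises on a 409, and updateWithRetry, its parameters, and the fake sender are names I made up for the example.

```php
<?php
// Hypothetical stand-in for the exception the client raises on HTTP 409.
class VersionConflict extends RuntimeException {}

/**
 * Calls $send until it succeeds or $maxAttempts is exhausted.
 * $send performs one update request and throws VersionConflict on a 409.
 */
function updateWithRetry(callable $send, int $maxAttempts = 5, int $baseDelayMs = 10)
{
    for ($attempt = 1; ; $attempt++) {
        try {
            return $send();
        } catch (VersionConflict $e) {
            if ($attempt >= $maxAttempts) {
                // Surface the failure so the append is never dropped silently.
                throw $e;
            }
            // Linear backoff: wait a little longer after each conflict.
            usleep($baseDelayMs * 1000 * $attempt);
        }
    }
}

// Fake sender that conflicts twice and then succeeds.
$calls = 0;
$result = updateWithRetry(function () use (&$calls) {
    $calls++;
    if ($calls < 3) {
        throw new VersionConflict('version conflict');
    }
    return ['result' => 'updated'];
}, 5, 1);
```

The difference from retry_on_conflict is that the final failure is rethrown to my code, so I can log it or queue the value for later instead of losing it.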

Any advice would be greatly appreciated. Btw, I'm using Elasticsearch 6.0.1.

Thanks.

Chuck.

system · April 13, 2018, 3:20am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
