Retry_on_conflict when using partial update

PUT /intro_limits
{
  "settings": {
    "index": {
      "number_of_shards": 1
    },
    "auto_expand_replicas": "0-all"
  },
  "mappings": {
    "doc": {
      "properties": {
        "id": { "type": "long" },
        "list": { "type": "long" }
      }
    }
  }
}

I need to apply partial updates to documents in the index above concurrently.

The list field holds an array of long values (up to several thousand per document), and new long values are appended to it frequently.

I observed that the partial update occasionally failed with a 409 status code.

So I ended up setting retry_on_conflict=5, like this:

public function setIntroLimitAddParams(int $id, int $id_to_be_added)
{
    return [
        'id' => $id,
        'index' => 'intro_limits',
        'type' => 'doc',
        'retry_on_conflict' => 5,
        'body' => [
            'script' => [
                'source' =><<<EOT
if (ctx._source.list.contains(params.x)) { ctx.op = 'noop' }
else { ctx._source.list.add(params.x) }
EOT
                ,
                'lang' => 'painless',
                'params' => [
                    'x' => $id_to_be_added
                ]
            ],
            'upsert' => [
                'id' => $id,
                'list' => [ $id_to_be_added ]
            ]
        ]
    ];
}

Ever since I set the retry_on_conflict parameter, the 409 errors have completely gone. However, in my use case the integrity of the data is very important: no missing entries are allowed.

My question is: does letting Elasticsearch resolve conflicts via the retry_on_conflict parameter guarantee that I will never lose data?

While testing my application, I just saw one case where a value was NOT appended to list, even though every response from Elasticsearch was a 200 (no 409). What frustrates me is that I can't reproduce it.
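
For the next test run I plan to log what each update actually did, so that a missing value can be traced back to a specific request. A rough sketch, assuming the PHP client returns the decoded update response as an array with `_id`, `_version` and `result` keys (the helper name `describeUpdateResponse` is my own, not part of any library):

```php
<?php
// Hypothetical helper: turn an update response into one log line.
// Assumes $response is the decoded JSON body of an Update API call.
function describeUpdateResponse(array $response, int $idToBeAdded): string
{
    // 'created' = the upsert ran, 'updated' = the script changed the document,
    // 'noop' = the script left the document untouched.
    $result  = $response['result'] ?? 'unknown';
    $version = $response['_version'] ?? 'unknown';
    $docId   = $response['_id'] ?? 'unknown';

    return sprintf('doc=%s add=%d result=%s version=%s', $docId, $idToBeAdded, $result, $version);
}

// Example with a hand-written response array (not real output from my cluster).
echo describeUpdateResponse(
    ['_id' => '42', '_version' => 7, 'result' => 'updated'],
    1001
), PHP_EOL;
```

A 200 response whose result is noop for a value that should have been new would at least tell me which request to look at.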

So I'm wondering whether it is sensible to rely on the retry_on_conflict parameter in my use case, or whether I should instead handle the 409 error myself by resending the request at the application level until I get a 200.

Any advice would be greatly appreciated. Btw, I'm using Elasticsearch 6.0.1.
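
By handling it at the application level I mean something like the following. It is only a sketch: `updateWithRetry` is a made-up name, and it assumes the client reports a 409 as an exception whose code is 409, which I have not checked for every client version.

```php
<?php
// Hypothetical helper: retry an update at the application level on HTTP 409.
// $sendUpdate is any callable that performs the request and, on a version
// conflict, throws an exception whose code is 409 (an assumption here).
function updateWithRetry(callable $sendUpdate, int $maxAttempts = 5, int $baseDelayMs = 50)
{
    if ($maxAttempts < 1) {
        throw new \InvalidArgumentException('maxAttempts must be at least 1');
    }

    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        try {
            // Success: hand the response back to the caller.
            return $sendUpdate();
        } catch (\Exception $e) {
            // Re-throw anything that is not a conflict, and the final conflict.
            if ($e->getCode() !== 409 || $attempt === $maxAttempts) {
                throw $e;
            }
            // Linear backoff before the next attempt.
            usleep($baseDelayMs * 1000 * $attempt);
        }
    }
}
```

The callable would wrap the actual update call built from setIntroLimitAddParams(), with retry_on_conflict removed so that conflicts reach the application.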

Thanks.

Chuck.
