Why does reindex cause Updates?

I am running a reindex operation into a new index, but I get results like this:

INFO:__main__:completed task Xp8tUzxiScenFu_pV7gmQQ:1558654
INFO:__main__:{'deleted': 0, 'batches': 149, 'version_conflicts': 0, 'total': 148161, 'created': 148159, 'noops': 0, 'throttled_until_millis': 0, 'updated': 2, 'requests_per_second': -1.0, 'throttled_millis': 0, 'retries': {'bulk': 0, 'search': 0}}

So as a result, I am creating new records for nearly all documents, but 2 records are reported as updated.
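
As a quick sanity check on those counters, here is a minimal sketch in Python. The dict is copied from the log output above (only the counter fields are kept), and the helper name `accounted_for` is made up for illustration.

```python
# Counter fields copied from the task status in the log output above.
status = {
    "deleted": 0, "batches": 149, "version_conflicts": 0,
    "total": 148161, "created": 148159, "noops": 0, "updated": 2,
}

def accounted_for(status):
    """Sum the per-outcome counters reported by the reindex task."""
    outcomes = ("created", "updated", "deleted", "noops", "version_conflicts")
    return sum(status[key] for key in outcomes)

print(accounted_for(status), status["total"])
```

Both numbers come out equal for this output, so every processed document was counted as either created or updated; none were skipped as a noop or rejected as a version conflict.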

Before the reindex operation, the destination index does not exist; it gets created by the reindex.
Why is the reindex producing these updates if I am simply reindexing into a brand-new, empty index?

I do not understand how this is possible.

Thanks for any help you can give.

Could you show us the body of the reindex request? Also, do you set the IDs manually?

Thanks, please find the body of the reindex request below. I don't set the IDs manually. All I do is include a check in the script that turns the operation into a noop when the id is an empty string, because I was getting failures caused by non-existent IDs. (I didn't understand that either, but the script fixed it.)

Admittedly, I am going from daily indices to a monthly index. All the same, it seems to me that the probability of 2 IDs colliding by chance has to be much lower than the number of "updates" I am getting.

curl -XPOST "http://localhost:9200/_reindex?wait_for_completion=false" -H 'Content-Type: application/json' -d'
{
  "conflicts": "proceed",
  "source": {
    "remote": {
      "host": "http://xxx.xx.x.xxx:9200",
      "socket_timeout": "1m",
      "connect_timeout": "10s"
    },
    "index": "logstash-2017.01.*",
    "type": ["api-log", "search-log"]
    
  },
  "dest": {
    "index": "logstash-2017.01"
  },
  "script": {
    "source": "if (ctx._id==\"\"){ctx.op=\"noop\"}else{ctx._source.doctype=ctx._type;ctx._type=\"doc\";}",
    "lang": "painless"
  }
}'

Thank you.

What error message do you get when you remove "conflicts": "proceed"?
I noticed that the task response reports no noop documents ('noops': 0). So maybe this condition is not working as intended and is leading to the updates?
Could you also check that the IDs are unique across the documents of types "api-log" and "search-log", since you're reindexing them all into the same index, "logstash-2017.01"?
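
One way to run that uniqueness check offline is sketched below in Python. It assumes the (index, type, _id) triples have already been exported from the source cluster into a list (for example via a scroll query); the sample IDs and both function names are invented for illustration. The idea: the script rewrites every _type to "doc" and all daily indices land in one destination index, so any _id that occurs more than once in the source would be written once and then overwritten, which reindex should report as "updated" under its default internal versioning.

```python
from collections import Counter

def duplicate_ids(docs):
    """Return {_id: count} for every _id that appears more than once.

    docs is an iterable of (index, type, _id) triples. Index and type are
    ignored on purpose, since the destination has one index and one type.
    """
    counts = Counter(doc_id for _index, _type, doc_id in docs)
    return {doc_id: n for doc_id, n in counts.items() if n > 1}

def predicted_counters(docs):
    """Estimate created/updated counts if all docs go to one index and type."""
    docs = list(docs)
    unique = len({doc_id for _index, _type, doc_id in docs})
    return {"created": unique, "updated": len(docs) - unique}

# Hypothetical sample data: "AVabc1" appears in two daily indices.
sample = [
    ("logstash-2017.01.01", "api-log", "AVabc1"),
    ("logstash-2017.01.01", "search-log", "AVabc2"),
    ("logstash-2017.01.02", "api-log", "AVabc1"),
    ("logstash-2017.01.02", "search-log", "AVabc3"),
]
print(duplicate_ids(sample))
print(predicted_counters(sample))
```

If the real export shows exactly two repeated IDs, that would account for 'updated': 2 without any problem in the noop condition. This sketch does not model documents skipped by the script's noop branch.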
