Reindex from multiple indices to a new one with new document IDs


(Dennis Dyall Wallin) #1

Hi! I used _reindex to reindex logs-2016-11-* into logs-2016-11.

I expected about 9,000,000 docs in the resulting index.
Then I ran _cat/indices/logs-2016-11?v
and saw that the index contains only about half of the documents; the other half seem to be counted in the 'docs.deleted' column.
I assume this is because of clashing IDs. Also, when I inspected the reindex task, I saw that a lot of docs were being deleted and only a few created.

So does this mean that the daily indices in logs-2016-11-* can have clashing IDs, and therefore the reindex operation is just updating documents that share an ID?
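
A minimal sketch of what is probably happening, written as a plain Python simulation and not as real Elasticsearch calls. It assumes the destination index behaves like a map keyed by _id, so a second document arriving with an already-used _id overwrites the first one, and the replaced copy is what shows up under docs.deleted until segments are merged. The sample IDs and the helper simulate_reindex are made up for illustration.

```python
def simulate_reindex(source_indices):
    """Copy docs from several source indices into one destination,
    keeping each document's original _id (the default _reindex behaviour)."""
    dest = {}          # destination index modelled as _id -> document
    created = 0
    overwritten = 0    # each overwrite leaves one deleted copy behind
    for index_name, docs in source_indices.items():
        for doc_id, body in docs.items():
            if doc_id in dest:
                overwritten += 1
            else:
                created += 1
            dest[doc_id] = {"from": index_name, **body}
    return dest, created, overwritten


# Hypothetical daily indices whose IDs restart from 1 every day.
daily = {
    "logs-2016-11-01": {"1": {"msg": "a"}, "2": {"msg": "b"}},
    "logs-2016-11-02": {"1": {"msg": "c"}, "2": {"msg": "d"}},
}

dest, created, overwritten = simulate_reindex(daily)
print(len(dest), created, overwritten)  # 2 2 2
```

Four documents go in, but only two survive, which matches the "half the docs, half deleted" picture above.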

In that case, how do I tell Elasticsearch to reindex them with new IDs?
I tried:

{
  "size": 10,
  "source": {
    "index": ["logs-2016-11-*"],
    "_source": [<list of properties I want to keep EXCLUDING _id>]
  },
  "dest": {
    "index": "tesssssst"
  }
}

And I also tried to use a script:

script: "ctx.source._id = null"

Then I got a null pointer exception.

Please advise!


(Dennis Dyall Wallin) #2

I think I solved this by using

ctx._id = null
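
For anyone landing here later, a sketch of how the full request body might look with that script, built as a Python dict so it can be printed and pasted. The index names are the ones from the first post. The script key is an assumption that depends on the version: newer releases use "source" for the script text, while the 5.x releases from around the time of this thread used "inline". Setting ctx._id to null is the fix reported above; the expectation is that Elasticsearch then generates a fresh ID for every document.

```python
import json


def build_reindex_body(source_pattern, dest_index, script_key="source"):
    """Build a _reindex request body that clears each document's _id."""
    return {
        "source": {"index": [source_pattern]},
        "dest": {"index": dest_index},
        "script": {
            "lang": "painless",
            # script_key is "source" on newer versions, "inline" on older 5.x
            script_key: "ctx._id = null",
        },
    }


body = build_reindex_body("logs-2016-11-*", "logs-2016-11")
print(json.dumps(body, indent=2))  # send as the body of POST _reindex
```

Note that with generated IDs, running the same reindex twice would be expected to add every document a second time, since nothing ties the new copies to the old ones.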


(Nik Everett) #3

ctx._id = ctx._index + ':' + ctx._id might also work if you need to preserve the ids.
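
A small Python sketch of why this variant avoids the clashes, again as a simulation and not Elasticsearch code. The function prefixed_id mirrors the Painless expression above; the sample IDs are invented. One assumed benefit over null IDs is that the new IDs are deterministic, so rerunning the reindex should overwrite the same documents instead of adding duplicates.

```python
def prefixed_id(index_name, doc_id):
    """Mirror of the script: ctx._id = ctx._index + ':' + ctx._id"""
    return index_name + ":" + doc_id


# Hypothetical (index, _id) pairs; the same _id appears in two daily indices.
hits = [
    ("logs-2016-11-01", "1"),
    ("logs-2016-11-02", "1"),
    ("logs-2016-11-02", "2"),
]

new_ids = [prefixed_id(index_name, doc_id) for index_name, doc_id in hits]
print(new_ids)
# ['logs-2016-11-01:1', 'logs-2016-11-02:1', 'logs-2016-11-02:2']

# No two documents share an ID any more.
assert len(set(new_ids)) == len(hits)
```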

