Not all documents copied after reindex

I modified a template for the index to convert a couple of variables from string to long.
Then I issued a reindex call:

POST _reindex
{
  "source": {
    "index": "index-old"
  },
  "dest": {
    "index": "index-new"
  }
}

No errors anywhere, but I'm missing about half the documents when the task completes:

GET _cat/indices/index-*
health status index     uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   index-old tEdSeRmFQ7WhGFoWb4vYVw   6   1     993613            0      1.9gb        986.7mb
green  open   index-new 3KOMWY9lSyyASJS9TgPMEQ   6   1     532999            0        1gb        541.2mb    

I've done this many times before and reindex is usually very reliable. How can I figure out what happened to the rest of the documents?

Thanks!


Is there anything in your Elasticsearch logs that might explain it?

Nothing. I was tailing the logs in real time on the server where this task was running. Not a single entry.

Very odd. Can you post the mappings from each index and a sample doc?

Mappings are ~800 lines long. Do you want just the differences or the whole thing?

What about putting it into gist/pastebin/etc?

Did the reindex not complete? Check the task list for the details:

GET _cat/tasks?pretty&v
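
If you start the reindex with wait_for_completion=false, the stored task result also records any per-document failures, so a silent count mismatch like this should show up there. A sketch (the task id below is illustrative, not a real one):

POST _reindex?wait_for_completion=false
{
  "source": { "index": "index-old" },
  "dest": { "index": "index-new" }
}

GET _tasks/oTUltX4IQMOUUVeiohTt8A:12345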

Reindexing task definitely completes. I monitor it in the log files and with

GET _tasks?detailed=true&actions=*reindex
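
The status block of that call reports per-document counters, so if created + updated + version_conflicts ends up well short of total when the task finishes, documents are being dropped somewhere. Roughly what I'm watching (values illustrative, not from an actual run):

"status": {
  "total": 1151419,
  "created": 634999,
  "updated": 0,
  "deleted": 0,
  "batches": 635,
  "version_conflicts": 0,
  "noops": 0
}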

Old mapping:
https://pastebin.com/fxnsHE27

New mapping:
https://pastebin.com/8dGyEgFe

Example doc:
https://pastebin.com/ZTQLqxTS

Another index is now exhibiting an identical problem: it copies about half the data and just stops without errors. This one is much larger, so I don't think it's a resource problem.

Any ideas?

What version are you on?

elasticsearch-6.2.3

A couple of other things I noticed:
It doesn't always stop on the same document.
Out of a total of 1151419 documents, the first run quit on 634999 and the second on 651999.

I was also looking at _nodes/stats and didn't see any significant differences between the nodes running the reindex tasks and those that weren't.

I tried reindexing into a different name to make sure errors in the template weren't affecting reindexing, and the same problem occurred: only about half the data got copied.

Does anyone have any other suggestions for troubleshooting/debugging this?

Thanks!

It might be worth raising an issue on GitHub.

Slightly unrelated question:
What happens when template includes a mapping for a variable to be cast as long, but variable comes in with quotes like "1024"? Does it get discarded? I noticed that the sum docs.count + docs.deleted are close to the total number of documents in the original index. So while there are still some documents completely missing from destination index this would at least explain part of it.

In my latest test out of 1151419 total documents, destination index contains docs.count = 684045 and docs.deleted = 196412. So while 270962 are still missing, at least 200k are simply deleted.

I also tried reindexing into a name that doesn't match any templates and all documents made it across. So while it might still be a bug worth reporting on GitHub I suspect I just don't have a full understanding of how mapping changes affect reindexing.
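
From what I can tell from the docs, numeric fields have coerce enabled by default, so a quoted value like "1024" should be parsed to a long rather than discarded. Disabling coercion in the template would at least make bad values fail loudly as mapping errors instead of being silently accepted. Something like this, if I understand it right (template and field names are just examples, not my actual ones):

PUT _template/index-template-test
{
  "index_patterns": ["index-new*"],
  "mappings": {
    "doc": {
      "properties": {
        "bytes": {
          "type": "long",
          "coerce": false
        }
      }
    }
  }
}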

Deleted docs could be an indicator of things being overwritten.

Are you using beats as the data source? Are they passed to anything else before they hit Elasticsearch?

Deleted docs only show up during reindexing, but the original source is filebeat.
The full path for the data is filebeat -> logstash (all filtering happens here) -> redis -> logstash -> elasticsearch.

I'm not altering the data in any way during reindexing. Only the mapping template is different since I'm trying to remap a couple of strings into longs.

Do I need to cast the field in a script, or should updating the mapping in the template be enough?
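
i.e. something along these lines, if a script is what's needed (the field name is just an example, not my actual field):

POST _reindex
{
  "source": { "index": "index-old" },
  "dest": { "index": "index-new" },
  "script": {
    "lang": "painless",
    "source": "if (ctx._source.bytes instanceof String) { ctx._source.bytes = Long.parseLong(ctx._source.bytes); }"
  }
}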

Thanks!

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.