I recently created a new index via the reindex API. I am now running both indexes side by side and indexing/deleting documents in both. I have noticed that the document counts of the two indexes differ slightly (by anywhere from 0 to 300). The difference varies over time and can grow or shrink.
To calculate the difference I've been using the cat count API; the indexes in question hold ~3.4m items each.
Is this to be expected? I feel like everything is okay and that some kind of background process is causing the counts to be off slightly.
Just like _update_by_query, _reindex takes a snapshot of the source, but its destination must be a different index, so version conflicts are unlikely.
So if you are changing the source while reindexing, I would expect some inconsistency between the source and the destination after the reindexing is complete.
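One way to check whether the gap is just timing is to refresh both indexes right before counting them. Below is a minimal Python sketch of that check. It assumes an unsecured cluster at `http://localhost:9200` and uses the hypothetical index names `items_v1` and `items_v2`; neither comes from the original post. It calls the `_refresh` and `_count` endpoints rather than the cat count API, on the assumption that counts only reflect documents already made visible by a refresh.

```python
import json
import urllib.request

# Assumption: local cluster without authentication; adjust for your setup.
BASE_URL = "http://localhost:9200"


def refresh_url(base, index):
    """Build the URL of the refresh endpoint for one index."""
    return f"{base}/{index}/_refresh"


def count_url(base, index):
    """Build the URL of the count endpoint for one index."""
    return f"{base}/{index}/_count"


def parse_count(body):
    """Extract the "count" field from a _count JSON response body."""
    return int(json.loads(body)["count"])


def fetch_count(base, index):
    """Refresh the index, then return its current document count."""
    # Refresh first so that recent writes and deletes are visible to the count.
    request = urllib.request.Request(refresh_url(base, index), method="POST")
    urllib.request.urlopen(request).close()
    with urllib.request.urlopen(count_url(base, index)) as response:
        return parse_count(response.read())


def count_difference(base, old_index, new_index):
    """Return count(old_index) - count(new_index), each taken after a refresh."""
    return fetch_count(base, old_index) - fetch_count(base, new_index)
```

Calling `count_difference(BASE_URL, "items_v1", "items_v2")` while writes are paused should return 0 if the two indexes really hold the same documents. If it still returns a nonzero value, the gap more likely comes from changes made to the source during the reindex than from refresh timing.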