I'm reindexing daily indices into monthly ones, and sometimes the reindex process seems to enter a loop: it starts again after it has already reindexed all the documents.
For example, take the index indexName-2021.05.01, which has 500,000 documents and which I want to reindex into indexName-2021.05. I start the reindex with the following request:
POST _reindex
{
  "source": {
    "index": "indexName-2021.05.01"
  },
  "dest": {
    "index": "indexName-2021.05"
  }
}
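As a side note, the same request can also be submitted with wait_for_completion=false, in which case Elasticsearch answers right away with a task ID instead of holding the HTTP connection open until the reindex finishes. Below is a minimal Python sketch that only builds the two requests involved. The base URL and the function names are my own assumptions for illustration, and I have not confirmed whether submitting the request this way changes the behaviour described in this post.

```python
import json
from urllib.request import Request

# Assumption: a local, unsecured cluster. Adjust to the real endpoint.
ES_URL = "http://localhost:9200"


def build_reindex_request(source, dest, base_url=ES_URL):
    """Build (but do not send) an asynchronous _reindex request."""
    body = json.dumps(
        {"source": {"index": source}, "dest": {"index": dest}}
    ).encode("utf-8")
    return Request(
        # wait_for_completion=false makes Elasticsearch return {"task": "<id>"}
        f"{base_url}/_reindex?wait_for_completion=false",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_task_request(task_id, base_url=ES_URL):
    """Build (but do not send) a GET _tasks/<task_id> request."""
    return Request(f"{base_url}/_tasks/{task_id}", method="GET")
```

Either request could then be sent with urllib.request.urlopen(...), and the JSON response of the first one would carry the task ID under the "task" key.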
Then I use GET _tasks?actions=*reindex&detailed to get the task ID and GET _tasks/taskID to monitor the progress.
I can see the created count increasing as expected:
          "status" : {
            "total" : 500000,
            "updated" : 0,
            "created" : 184000,
            "deleted" : 0,
            "batches" : 184,
            "version_conflicts" : 0,
            "noops" : 0,
            "retries" : {
              "bulk" : 0,
              "search" : 0
            }
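To turn a status excerpt like the one above into a single progress figure, something along these lines could be used. It is only a sketch: the function name is hypothetical, and it assumes that every document the task has handled shows up in exactly one of the created, updated, deleted, noops or version_conflicts counters.

```python
def reindex_progress(status):
    """Return the fraction (0.0 to 1.0) of documents already processed.

    `status` is the "status" object of a reindex task as returned by the
    tasks API.
    """
    total = status.get("total", 0)
    if total == 0:
        return 0.0
    # Assumption: these counters together cover every processed document.
    counters = ("created", "updated", "deleted", "noops", "version_conflicts")
    processed = sum(status.get(name, 0) for name in counters)
    return processed / total


# Values taken from the status excerpt above.
status = {
    "total": 500000,
    "updated": 0,
    "created": 184000,
    "deleted": 0,
    "batches": 184,
    "version_conflicts": 0,
    "noops": 0,
}
print(f"{reindex_progress(status):.1%}")
```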
I also use Discover in Kibana to compare the documents: in one tab I filter by the daily index and in the other I filter by the monthly index, both using the same time interval, which corresponds to the time interval covered by the daily index.
After some time I check the progress again using the task ID and get a response saying that the task no longer exists. In Discover I can see that all documents were reindexed as expected.
But if I run GET _tasks?actions=*reindex&detailed again to list the running tasks, I see several reindex tasks for the same index, and this time it is the updated counter in the status that keeps increasing:
          "status" : {
            "total" : 500000,
            "updated" : 175000,
            "created" : 0,
            "deleted" : 0,
            "batches" : 175,
            "version_conflicts" : 0,
            "noops" : 0,
            "retries" : {
              "bulk" : 0,
              "search" : 0
            }
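To spot this situation without reading the whole tasks response by eye, the output of GET _tasks?actions=*reindex&detailed can be grouped by task description. This is a rough sketch: the function name is hypothetical, and it assumes the usual response layout of nodes, then tasks, with an action and a description on each task, and that identical reindex requests produce identical descriptions.

```python
from collections import defaultdict


def find_duplicate_reindex_tasks(tasks_response):
    """Map each description shared by 2+ running reindex tasks to their IDs."""
    groups = defaultdict(list)
    for node in tasks_response.get("nodes", {}).values():
        for task_id, task in node.get("tasks", {}).items():
            # Reindex tasks have an action name ending in "reindex".
            if task.get("action", "").endswith("reindex"):
                groups[task.get("description", "")].append(task_id)
    # Keep only descriptions that occur more than once.
    return {
        description: sorted(task_ids)
        for description, task_ids in groups.items()
        if len(task_ids) > 1
    }
```

Any task ID reported this way is a candidate for a closer look with GET _tasks/taskID before deciding what to cancel.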
My question is: what can cause this, given that the documents were already reindexed?
Sometimes this reindex loop runs for a long time, which significantly increases the load and CPU usage on the nodes doing the reindex.
Normally, when this happens and I have confirmed that all the documents were reindexed, I cancel all the reindex tasks using:
POST _tasks/_cancel?actions=*reindex
But I would like to know why this is happening and how to avoid it.
I'm currently running version 7.9.3; an upgrade to 7.12.1 is planned for the coming weeks.