I'm reindexing time-series data into bigger indices. I have daily indices named like logs-yyyy-MM-dd
and I'm reindexing them into monthly indices using:
POST _reindex
{
  "source": {
    "index": "logs-2024-08-*"
  },
  "dest": {
    "index": "logs-2024-08"
  }
}
I've done this one month at a time, and although it has taken some time, it has generally worked fine. The document count for each month has been 50-60 M.
For more recent months the document count is higher, 80-120 M. The indexing rate seems to have dropped abruptly at around 45-50 M documents and has been decreasing steadily since.
When I start new reindex jobs now, the indexing rate never approaches the rates I had earlier.
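To put numbers on the drop, here is a minimal sketch of how the indexing rate could be derived from the reindex task itself. It assumes the job is started with wait_for_completion=false and the task is polled with GET _tasks/<task_id>, whose status object carries "created" and "updated" counters. The helper names and the sample values are hypothetical, for illustration only.

```python
def docs_per_second(earlier, later):
    """Average documents written per second between two task samples.

    Each sample is a (timestamp_seconds, status) pair, where status is the
    "status" object of a reindex task with "created"/"updated" counters.
    """
    t0, s0 = earlier
    t1, s1 = later
    elapsed = t1 - t0
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    # Documents written so far = newly created + overwritten (updated).
    written0 = s0.get("created", 0) + s0.get("updated", 0)
    written1 = s1.get("created", 0) + s1.get("updated", 0)
    return (written1 - written0) / elapsed


def rates(samples):
    """Rate for each consecutive pair of samples, oldest first."""
    return [docs_per_second(a, b) for a, b in zip(samples, samples[1:])]


# Made-up samples taken 60 seconds apart (not real measurements).
example_samples = [
    (0, {"created": 0}),
    (60, {"created": 600_000}),
    (120, {"created": 900_000}),
]
example_rates = rates(example_samples)  # [10000.0, 5000.0]
```

Logging these rates alongside the running document count would show whether the slowdown really starts at the same count each time.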
The cluster has 3 EC2 nodes using EBS gp3 volumes. I can see that the Volume IOPS exceeded check has been 1 since about the same time as the abrupt drop in indexing rate, but since gp3 volumes are not burstable, I don't think this is the issue? (I'm not allowed to include more screenshots since I'm a new user.)
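As a rough cross-check of that metric, the sketch below compares a volume's average IOPS over a CloudWatch period with its provisioned limit. It assumes the per-period sums of the VolumeReadOps and VolumeWriteOps metrics are available, and that the volumes run at the gp3 baseline of 3,000 IOPS with no extra IOPS provisioned. The function names and the 95% tolerance are hypothetical choices.

```python
# gp3 baseline IOPS; an assumption that no additional IOPS were provisioned.
GP3_BASELINE_IOPS = 3000


def average_iops(read_ops, write_ops, period_seconds):
    """Average IOPS over one period, from the summed read and write op counts."""
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    return (read_ops + write_ops) / period_seconds


def at_iops_limit(read_ops, write_ops, period_seconds,
                  provisioned_iops=GP3_BASELINE_IOPS, tolerance=0.95):
    """True if the average IOPS is within `tolerance` of the provisioned limit."""
    return average_iops(read_ops, write_ops, period_seconds) >= tolerance * provisioned_iops
```

For example, 90,000 reads plus 90,000 writes in a 60-second period averages 3,000 IOPS, which would sit right at the assumed baseline.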