Big drop in indexing rate when reindexing

I'm reindexing time-series data into bigger indices. I have daily indices named like logs-yyyy-MM-dd and I'm reindexing them into bigger monthly indices using:

POST _reindex
{
  "source": {
    "index": "logs-2024-08-*"
  },
  "dest": {
    "index": "logs-2024-08"
  }
}
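For anyone running several of these side byise, here is a minimal sketch of how the same request could be submitted as a background task with a throttle, using the `wait_for_completion` and `requests_per_second` query parameters of the reindex API. The helper name `build_reindex_request` and the value 1000 are assumptions for illustration, not settings from my cluster, and the code only builds the request; it does not send it.

```python
import json
from urllib.parse import urlencode


def build_reindex_request(source_pattern, dest_index, requests_per_second=None):
    """Return (method, path, body) for a background _reindex call.

    Hypothetical helper: it only assembles the request, it does not send it.
    """
    # wait_for_completion=false makes Elasticsearch return a task ID right away
    params = {"wait_for_completion": "false"}
    if requests_per_second is not None:
        # Throttle the job so it does not compete for all available disk IOPS
        params["requests_per_second"] = str(requests_per_second)
    body = {"source": {"index": source_pattern}, "dest": {"index": dest_index}}
    return "POST", "/_reindex?" + urlencode(params), json.dumps(body)


# Example value only; a suitable rate depends on the cluster and the volumes.
method, path, body = build_reindex_request(
    "logs-2024-08-*", "logs-2024-08", requests_per_second=1000
)
print(method, path)
print(body)
```

Running the job as a task means it can be inspected, throttled or canceled later through the task ID that the call returns.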

I've done this one month at a time. It has taken some time, but it has generally worked fine. The document count for each month has been 50-60 M.

For more recent indices the document count is higher: 80-120 M. The indexing rate seems to have dropped abruptly at around 45-50 M documents, and it has decreased steadily since.

When I start new reindex jobs now, the indexing rate never approaches the rates I had earlier.

It's a cluster with 3 EC2 nodes using EBS gp3 volumes. I can see that the Volume IOPS exceeded check has been 1 since about the same time as the abrupt drop in indexing rate, but since gp3 volumes are not burstable I don't think this is the issue? (I'm not allowed to include more screenshots since I'm a new user.)

I had ~10 reindex jobs running, so I canceled 3 (those with the most documents remaining). After this, the Volume IOPS exceeded metric changed from 1 to 0 (only one of the 3 volumes had exceeded it) and the indexing rate went back up to 2000-3000/s. So, problem solved, and Volume IOPS exceeded is crucial to watch out for!
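A rough sketch of how the jobs with the most documents remaining could be picked out, assuming the response shape of `GET _tasks?detailed=true&actions=*reindex` (tasks nested under `nodes`, each with a `status` holding `total`, `created`, `updated`, `deleted` and `noops`). The function names and the sample task IDs and counts below are made up for illustration; they are not output from my cluster.

```python
def remaining_docs(status):
    """Documents a reindex task still has to process, based on its status."""
    done = sum(status.get(key, 0) for key in ("created", "updated", "deleted", "noops"))
    return max(status.get("total", 0) - done, 0)


def cancel_paths(tasks_response, count):
    """Return cancel endpoints for the `count` reindex tasks with most docs remaining."""
    candidates = []
    for node in tasks_response.get("nodes", {}).values():
        for task_id, task in node.get("tasks", {}).items():
            if task.get("action", "").endswith("reindex"):
                candidates.append((remaining_docs(task.get("status", {})), task_id))
    candidates.sort(reverse=True)  # largest remaining count first
    return [f"/_tasks/{task_id}/_cancel" for _, task_id in candidates[:count]]


# Hypothetical sample in the shape of a _tasks response (values are invented).
sample = {
    "nodes": {
        "nodeA": {
            "tasks": {
                "nodeA:101": {"action": "indices:data/write/reindex",
                              "status": {"total": 100_000_000, "created": 40_000_000}},
                "nodeA:102": {"action": "indices:data/write/reindex",
                              "status": {"total": 80_000_000, "created": 70_000_000}},
                "nodeA:103": {"action": "indices:data/write/reindex",
                              "status": {"total": 120_000_000, "created": 30_000_000}},
            }
        }
    }
}

for path in cancel_paths(sample, 2):
    print("POST", path)
```

Instead of canceling, a running job can also be slowed down with `POST _reindex/<task_id>/_rethrottle?requests_per_second=<rate>`, which may be enough to get back under the volume's IOPS limit.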
