We are indexing around 7TB on a daily basis.
All the indices are being replaced once a day with a new ones (fresh data).
Each index represent one customer (business).
The variety of indices is big, from a few kilobytes index up to a 500 gigabytes index.
After being indexed, no more writes to an index.
It will be used for search only until a new index will be ready (in the upcoming day) and replace it.
We are using alias per index (when a new index is ready, we flip alias and delete the old one).
The "new"/"ready" indices are relocated into a dedicated nodes group (we call it "data-srv").
We have 4 nodes (64GB, 8 cores, 2.5TB) in this nodes group.
The last step before we replacing an index with a fresh one is changing the replication factor from zero to one. There could be dozens of indices that tries to add replica in the same time.
Recently we noticed that a lot of shards become unassigned during that time. Trying to analyze we see those UNASSIGNED state with REPLICA_ADDED as a reason.
Seems like the number of initializing shards and the number of relocating shards are limited, but we didn't figure out how this limitation works.
We assumed that there's a limitation / configuration and we tried to tweak those values:
indices.recovery.max_bytes_per_sec: 150mb. // default 40mb
cluster.routing.allocation.node_concurrent_recoveries: 16 // default 2
The intuition behind that was that by changing those we can increase parallelism/concurrency.
Then, we saw the following logs:
Unable to acquire permit to use snapshot files during recovery, this recovery will recover index files ...
We reverted this change (after reading this) and now we are waiting too much time for replicas.
- Any idea / suggestion how to increase the parallelism without getting those warning logs?
- Does it worth "marking" the indices as read-only somehow? will it improve performance somehow? make operations faster?