More Index replicas = faster sync of data?

Hello! I have a scenario that seems to behave the opposite of what the documentation (and logic) suggests. I hope someone can help me.

Environment:
ES 6.8, 3-node cluster on Alma 9 with a basic configuration (we should upgrade but can't at this time)
Each node has 300 GB RAM overall / 220 GB for the JVM heap, 30 CPUs
Our data is mostly in 1 index. We have about 9 other indexes, but they are very small. We have 7-12M documents totaling about 200 GB. (No big deal by ES standards, I think)

Workflow:
Periodically we drop the ES database and resync from Mongo. At that time we can configure the number of primary shards and replicas. With 3 nodes, I thought 3 primaries and 3 replicas for each index sounded like the way to go, based on the documentation. The sync takes 5-6 hours. It's the worst. During the sync, CPU on the nodes barely moves, disk I/O shows minimal change if any, RAM use increases a tiny bit, and network traffic is absurdly low at ~50 KB/sec.

Through trial and error I have discovered that if I set the primary and replica counts higher (even much higher), I can reduce sync time by 50-75%. Sync speed trends steadily upward until it reaches a point of diminishing returns, e.g. 1 primary : 3 replicas per index takes 5+ hours, while 10 primaries : 150 replicas takes 2.5+ hours.

Nothing unusual appears in the logs, and the data doesn't change. This holds whether I sync to a single-node cluster or to the 3-node cluster, with the settings mentioned above, and even when configured with 1 shard & 0 replicas.

My thought process:

  1. Syncing additional primary and replica shards should logically require more work than syncing fewer. ...right?
  2. Syncing to a 3-node cluster should be 3x the speed of syncing the same data to a 1-node cluster (at least with replicas=0).
  3. Lord, why does sync take so long?

Am I correct in my assumptions? ... in being perplexed?

A related question:
We do our bulk sync/index 2,000 docs at a time. We send all requests to the IP address of one of our 3 nodes. When I configured the sync with 0 replicas for testing, I took a screenshot I'll attach here. Obviously each node has been assigned an index.
4. It would seem logical that ES would spread the data of any single index across all nodes for performance. Is that true? Is there a way to verify it's working, and not just assigning an index to a node based on session (or something)?

Thanks!

Hi @SQIGGLES and welcome!

Regarding replicas: if you have 3 nodes then anything more than 2 replicas should make no difference (the remaining 148 replicas will be left unassigned), but I'd expect 0 replicas to be faster than 1 or 2 replicas, independent of the number of primaries.
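Here's that allocation arithmetic as a minimal sketch (the `shard_copies` helper is made up purely for illustration; the rule it encodes is that Elasticsearch never puts two copies of the same shard on the same node):

```python
# Sketch of the shard-copy arithmetic: each primary can have at most
# (nodes - 1) replica copies assigned, because a replica is never allocated
# to the same node as its primary or as another copy of the same shard.
def shard_copies(primaries, replicas, nodes):
    assigned = primaries * min(replicas, nodes - 1)
    unassigned = primaries * replicas - assigned
    return assigned, unassigned

# 10 primaries with 150 replicas each on a 3-node cluster:
print(shard_copies(10, 150, 3))  # (20, 1480) -- 148 copies of each shard unassigned
```

You can confirm this on a live cluster with `GET _cat/shards`, which lists every shard copy along with its state (STARTED vs UNASSIGNED) and the node it landed on.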

Regarding primaries: I'd expect increasing the number of primaries to decrease the latency for individual bulk indexing requests, but the only way this could increase your overall throughput is if you simply aren't pushing Elasticsearch hard enough. I expect you will be able to get the same kind of speedup (with less per-shard overhead) with just 3 primaries if you send many more bulk requests to ES all at the same time. The ideal situation during bulk indexing is for ES always to have at least a little bit of work queued up on each node, that way it'll never have any idle threads and you'll be using your resources to the full.
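A minimal sketch of that idea, with the actual bulk call stubbed out so it runs standalone (in practice you'd use something like the `parallel_bulk` helper in the official Python client, or any mechanism that keeps several bulk requests in flight at once):

```python
# Sketch: keep every node busy by having multiple bulk requests in flight
# concurrently, instead of sending one 2,000-doc batch at a time.
from concurrent.futures import ThreadPoolExecutor

def send_bulk(batch):
    # Stand-in for a real client.bulk(...) call against the cluster.
    return len(batch)

docs = list(range(10_000))
batches = [docs[i:i + 2000] for i in range(0, len(docs), 2000)]

# Several workers mean each node should always have work queued up,
# rather than sitting idle between serial requests.
with ThreadPoolExecutor(max_workers=8) as pool:
    indexed = sum(pool.map(send_bulk, batches))

print(indexed)  # 10000
```

The parallelism number is something to tune: raise it until the cluster's write thread pools stay busy without the queues rejecting requests.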


Also, yeah, oof, 6.8 is really old; you're missing out on almost 5 years of improvements by not upgrading. There are probably limits to how hard you can get away with pushing a 6.8 cluster before it starts to break. Recent 8.x versions are much more robust.


Thanks so much
