Improving Reindex Performance in v5.6

I'm currently reindexing 3.2bn documents from a 5-shard index into a 20-shard index on a 2-node cluster, where the machines are AWS EC2 i3.4xlarge instances.

The Elasticsearch installs are configured to use both ephemeral NVMe SSDs (1.9 TB each).

The reindexing runs through a Groovy script that simply takes the old ctx._id, calculates a SHA-256 hash of the value, and sets the ID to that hash.

The batch size on the reindex API call was set to 10,000.

The source and destination indices have 0 replicas, and the refresh interval on the destination index is -1.
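
For reference, the calls look roughly like this (index names are placeholders and the script is paraphrased from memory, so treat it as a sketch rather than the exact requests):

PUT new_index/_settings
{
  "index": {
    "number_of_replicas": 0,
    "refresh_interval": "-1"
  }
}

POST _reindex
{
  "source": {
    "index": "old_index",
    "size": 10000
  },
  "dest": {
    "index": "new_index"
  },
  "script": {
    "lang": "groovy",
    "inline": "ctx._id = java.security.MessageDigest.getInstance('SHA-256').digest(ctx._id.getBytes('UTF-8')).encodeHex().toString()"
  }
}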

Document routing is being used. Shard data is lumpy as hell, but that's a problem for another day.

No other activity is happening on this cluster.

What bothers me is that it's taking a long time, yet the netdata stats show that disk utilisation is very low, memory is well within limits (although I'm seeing a saw-tooth JVM heap graph, so I may need to decrease the heap size?), and CPU doesn't peak above 25%. This is the case on both nodes.

top shows that java is indeed the top process, but it's only using 2 cores.

The nodes API with the os flag enabled shows that Elasticsearch has correctly identified and allocated the available number of processors.
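
For reference, that's just the stock nodes info call:

GET _nodes/os

which reports available_processors and allocated_processors for each node; on an i3.4xlarge they should both come back as 16.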

Is there any way I can configure Elasticsearch to totally smash through the data and stress the machines a bit more? At the current rate it's going to take months to complete.

I would consider adding more nodes, but is there any point given that the existing ones aren't being used to their maximum potential?

Any help appreciated.

Getting the hot threads shows that the bulk processes are doing their thing, so I'm wondering about changing the thread_pool.bulk settings, which currently default to this:

"bulk": {
    "type": "fixed",
    "min": 16,
    "max": 16,
    "queue_size": 200
},

Would it be a good idea to increase the min and max?

In general, when folks have a lot of reindexing to do, I suggest that they split the process up into multiple reindex requests that'll each finish in tens of minutes, based on some natural key in their data. Then they can parallelize the process with whatever tools they like, and they can spot-check the results and all that.
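
For example (the index names and the "created" date field here are just placeholders; use whatever natural key your data actually has):

POST _reindex?wait_for_completion=false
{
  "source": {
    "index": "old_index",
    "size": 10000,
    "query": {
      "range": {
        "created": {
          "gte": "2017-01-01",
          "lt": "2017-02-01"
        }
      }
    }
  },
  "dest": {
    "index": "new_index"
  }
}

Kick off the next slice of the key space from another shell or whatever tool you like; with wait_for_completion=false each request returns a task id you can check with the tasks API.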

Can I run multiple batches simultaneously?

Too late! I'm already doing it :cowboy_hat_face: yeeeeehaaaahh!

Now we're getting some utilisation

Great! Yes, totally run multiple batches in parallel. If you push it too hard then the reindex will get execution rejections and will back off and retry. This isn't great for speed though. In other words: if you push it too hard it'll go slower. But you can and should manually parallelize it.
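
If you want to see whether you've crossed that line, the cat thread pool API is an easy way to watch for rejections while the parallel requests run, something like:

GET _cat/thread_pool/bulk?v&h=node_name,name,active,queue,rejected

If the rejected column keeps climbing as you add more concurrent reindex requests, you've gone past the point where extra concurrency helps and it's time to back off.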

Each of my nodes has two disks, and my index has 20 shards, so there are 10 shards on each node.

I've noticed that one of the nodes is working harder than the other one (I/O utilisation).
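
For anyone wanting to check the same thing, the fs section of node stats breaks disk usage down per data path:

GET _nodes/stats/fs

Each node lists every configured path.data entry with its total, free and available bytes, so you can see how much data has ended up on each disk.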

A closer look shows that one node has 5 shards per disk, while the other is a bit more unbalanced, with 3 on one disk and 7 on the other.

How is this decision made? Is it related to shard size? (Some of my shards are much larger than others due to the lumpy document routing.) Is there a way to force Elasticsearch to spread the shards evenly across the disks, and is that even a good idea?

Actually, I've just seen this: Shards Distribution in Multiple disks

Looking at the data, the one with seemingly fewer shards from my index has more data from other miscellaneous/internal indices (created by Kibana, Monitoring, etc.).

That sort of makes sense, and it's useful to know. In future, for example, I'd hold off on installing Kibana until my indices have spread their data nicely across all the disks.
