Reindex GC overhead

timprosteps · March 6, 2018, 8:59am

Hi,

We are trying to reindex one of our live indices, which contains 9.6 million docs and is 48 GB in size.
Elasticsearch is running on a single machine with 36 CPU cores and 60 GB of RAM. I gave Elasticsearch a 30 GB heap (in the jvm.options file).
We first tried this on a backup of this machine, which worked perfectly; it took around 30 minutes to reindex all the data.

Now we did the same on our production machine and it started out great: it got through about 7 million documents in just under 30 minutes, but then it slowed down so much that a single batch took 10 minutes. Eventually Elasticsearch stopped returning results and the reindex seemed to have stalled, although the task was still active.
After cancelling the task I can see that the new index contains 8 million docs, so it was pretty close. Unfortunately it is not usable like this.
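
For anyone following along, checking on and cancelling a reindex task goes through the Tasks API. Below is a minimal sketch in Python that only builds the requests and computes progress from a task status object; it does not talk to a cluster. The helper names are made up for illustration, and the status field names (`total`, `created`, `updated`, `deleted`) are assumed to match what the reindex task reports.

```python
def list_reindex_tasks_request():
    # Tasks API call that lists running reindex tasks with their status counters.
    return ("GET", "/_tasks?detailed=true&actions=*reindex")


def cancel_task_request(task_id):
    # task_id is the "<node_id>:<number>" identifier returned by the list call.
    return ("POST", f"/_tasks/{task_id}/_cancel")


def reindex_progress(status):
    # Fraction of documents processed so far.
    # Assumption: the status dict has "total", "created", "updated", "deleted".
    total = status.get("total", 0)
    done = sum(status.get(key, 0) for key in ("created", "updated", "deleted"))
    return done / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical status using the rough numbers from this post.
    status = {"total": 9_600_000, "created": 8_000_000, "updated": 0, "deleted": 0}
    print(list_reindex_tasks_request())
    print(cancel_task_request("node_id:12345"))  # placeholder task id
    print(f"{reindex_progress(status):.0%}")
```

With the approximate figures above (8 million of 9.6 million docs), the progress comes out at roughly 83%.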

Some more info:

We are running ES 5.3.0.
The new index has its refresh interval set to -1 to speed up indexing.
The reindex runs in batches of 5000.
The machine is using about 450% CPU (100% = 1 core) and about 9 GB of RAM during the reindex.
I'm seeing a lot of these warnings in the log:

"[2018-03-06T08:34:58,699][WARN ][o.e.m.j.JvmGcMonitorService] [BcXdUDQ] [gc][2744] overhead, spent [1.1s] collecting in the last [1.7s]"

I suppose this is the issue? Is there any way I can prevent these warnings and get my reindex through? I think the machine should be powerful enough to handle this easily.
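
To put the quoted warning in numbers: it reports 1.1 s of garbage collection within a 1.7 s window. A rough sketch in Python for pulling that ratio out of such log lines follows. The regular expression is written against the one line quoted above, so the handling of other units (`ms`, `m`) is an assumption, and the function name is made up for illustration.

```python
import re

# Matches the tail of a JvmGcMonitorService overhead message, for example:
#   overhead, spent [1.1s] collecting in the last [1.7s]
GC_OVERHEAD = re.compile(
    r"overhead, spent \[(?P<spent>[\d.]+)(?P<spent_unit>ms|s|m)\] "
    r"collecting in the last \[(?P<window>[\d.]+)(?P<window_unit>ms|s|m)\]"
)

# Assumption: durations are printed with one of these unit suffixes.
UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "m": 60.0}


def gc_overhead_fraction(line):
    """Return the share of the window spent collecting, or None if no match."""
    match = GC_OVERHEAD.search(line)
    if match is None:
        return None
    spent = float(match["spent"]) * UNIT_SECONDS[match["spent_unit"]]
    window = float(match["window"]) * UNIT_SECONDS[match["window_unit"]]
    return spent / window if window else None


if __name__ == "__main__":
    line = (
        "[2018-03-06T08:34:58,699][WARN ][o.e.m.j.JvmGcMonitorService] "
        "[BcXdUDQ] [gc][2744] overhead, spent [1.1s] collecting in the last [1.7s]"
    )
    print(f"{gc_overhead_fraction(line):.0%}")
```

For the line in this post that works out to about 65% of the window spent in garbage collection.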

system · April 3, 2018, 8:59am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
