Increase in Indexing Time and Big Merges

Hi,

We have a 7-node ES cluster in our development environment: 3 are master nodes and the remaining 4 are data nodes. The masters run with the default heap, and the data nodes run with a 4 GB heap on machines with 8 GB of RAM.

We do bulk inserts continuously and simultaneously fire aggregation queries every minute. We maintain day-wise indices with 5 shards and 1 replica. Each bulk request contains 1500 docs and is approximately 370 KB, so each day's index grows to roughly 35 GB.
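For context, our ingestion loop looks roughly like the sketch below (a simplified Python example using the requests library; the host, index-name pattern, type name, and field names are placeholders and not our actual code, and it assumes an ES 1.x-style bulk request as in this thread):

    import json
    import datetime
    import requests

    ES = "http://localhost:9200"  # placeholder host

    def bulk_insert(docs):
        # Day-wise index, e.g. events-2014.05.14 (naming pattern is illustrative)
        index = "events-" + datetime.date.today().strftime("%Y.%m.%d")
        lines = []
        for doc in docs:
            lines.append(json.dumps({"index": {"_index": index, "_type": "event"}}))
            lines.append(json.dumps(doc))
        body = "\n".join(lines) + "\n"  # the bulk API requires a trailing newline
        resp = requests.post(ES + "/_bulk", data=body,
                             headers={"Content-Type": "application/x-ndjson"})
        resp.raise_for_status()

    # Each call sends ~1500 docs (~370 KB) in one bulk request.
    bulk_insert([{"field": "value"}] * 1500)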

We are observing insertions slowing down as the indexing time of a bulk request climbs to about 6-7 seconds. This starts after 12-13 hours of insertions into an index, and the behaviour repeats for every index.

During the same period we also observe merges as large as 16 GB on some data nodes, and the index stats report around 32 GB of merging. This appears to affect the overall performance of ES.
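We read these merge numbers from the index stats API, roughly like this (a small sketch using the Python requests library; the host and index name are placeholders, and the exact response keys may differ between ES versions):

    import requests

    ES = "http://localhost:9200"     # placeholder host
    index = "events-2014.05.14"      # placeholder day-wise index

    stats = requests.get(ES + "/" + index + "/_stats").json()
    merges = stats["_all"]["total"]["merges"]

    # Total bytes merged so far and merges currently running
    print("merged bytes:", merges["total_size_in_bytes"])
    print("current merges:", merges["current"])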

We have tried various merge-level settings, such as increasing segments_per_tier (to 15), reducing index.store.throttle.max_bytes_per_sec to around 10 MB, and reducing merge.policy.max_merged_segment to 2 GB. These have reduced the big merges, but the increase in indexing time is still observed.
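For reference, we apply those settings with a dynamic index-settings update, roughly like this (a minimal Python sketch using requests; the host and index name are placeholders):

    import requests

    ES = "http://localhost:9200"     # placeholder host
    index = "events-2014.05.14"      # placeholder day-wise index

    settings = {
        "index.merge.policy.segments_per_tier": 15,
        "index.merge.policy.max_merged_segment": "2gb",
        "index.store.throttle.max_bytes_per_sec": "10mb",
    }

    resp = requests.put(ES + "/" + index + "/_settings", json=settings)
    print(resp.json())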

Please guide us on ways to keep the indexing time and insertion rate consistent, and on how to minimize the impact of merging.

Thanks
Mihir

You can try lowering max_merge_at_once. I'm not sure that throttling the
store would be effective; wouldn't that create a bottleneck and exhaust
the bulk threads?
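Something like the following should lower it on a live index (just a sketch; the host, index name, and chosen value are illustrative):

    import requests

    # Lower max_merge_at_once (the tiered policy default is 10) on an existing index
    requests.put("http://localhost:9200/events-2014.05.14/_settings",
                 json={"index.merge.policy.max_merge_at_once": 5})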

--
Ivan

Hi Mihir,
I had the same problem: indexing time increased from about 3 seconds for a bulk of 100k docs to over 500 seconds. After increasing the number of shards from 1 to 4 per node and setting indices.memory.index_buffer_size to 20%, the indexing time has stayed fairly constant at around 3-5 seconds.
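In case it helps, the changes were roughly these (a sketch only; the host, index name, and shard count are illustrative, and indices.memory.index_buffer_size is a node-level setting that requires a node restart):

    import requests

    # 1) In elasticsearch.yml on each data node (node restart needed):
    #      indices.memory.index_buffer_size: 20%

    # 2) Create the new day's index with more primary shards,
    #    e.g. 4 data nodes * 4 shards per node = 16 (illustrative numbers):
    requests.put("http://localhost:9200/events-2014.05.15",
                 json={"settings": {"number_of_shards": 16,
                                    "number_of_replicas": 1}})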