Index Settings for Optimizing Indexing Throughput

In our logging clusters we're indexing around 40k messages/sec, roughly 500 GB per index per day. The storage backend is Ceph, and it sometimes struggles as load increases over time.

Consequently, I'm wondering what I might do to optimize indexing in a way that accounts for sub-optimal storage. One thing I'm considering is doubling index.translog.flush_threshold_size. Is this considered a good move when looking to maximise throughput? We already have index.refresh_interval set to 30s, so I figured this change could align reasonably well with that.
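For reference, here's roughly what I'd apply (the default flush_threshold_size is 512mb, so doubling it means 1gb; the index name is just a placeholder for our daily logging indices):

```
PUT logs-example-index/_settings
{
  "index": {
    "refresh_interval": "30s",
    "translog": {
      "flush_threshold_size": "1gb"
    }
  }
}
```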


I'm not sure it'll make much difference, but you can try.

Looking at the manual page on tuning for indexing speed, the advice to avoid network-attached storage stands out. Could you move to a hot/warm architecture? That would let you index to fast local disks and then move each day's indices onto your Ceph-backed nodes once indexing is complete.
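A minimal sketch of how that can be wired up with shard allocation filtering, assuming a custom node attribute (here called box_type) and a placeholder daily index name:

```
# elasticsearch.yml on nodes with fast local disks:
#   node.attr.box_type: hot
# elasticsearch.yml on the Ceph-backed nodes:
#   node.attr.box_type: warm

# New indices are created on hot nodes; once a day's index is
# no longer being written to, retag it and it migrates to warm:
PUT logs-2024.01.01/_settings
{
  "index.routing.allocation.require.box_type": "warm"
}
```

Index lifecycle management can automate this rollover/relocation step on a schedule rather than doing it by hand.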
