How to improve refresh time?

Pavel_Pokorny · November 8, 2018, 10:13am

Hello,

In our production cluster on AWS we have 8 i3.xlarge ingest+data nodes and 3 m3.medium master nodes. Indices have refresh_interval=-1, translog.durability=async, 6 shards and 2 replicas.
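For reference, the index settings described above correspond roughly to the following settings body. This is only a sketch that builds the JSON in Python; the variable names are illustrative, and it assumes the settings are applied when each index is created.

```python
import json

# Index settings as described in the post (standard Elasticsearch setting names).
index_settings = {
    "settings": {
        "index": {
            "number_of_shards": 6,
            "number_of_replicas": 2,
            "refresh_interval": "-1",  # automatic periodic refresh disabled
            "translog": {"durability": "async"},  # translog fsynced in the background
        }
    }
}

# Serialized form, as it would appear in an index-creation request body.
settings_json = json.dumps(index_settings, indent=2)
print(settings_json)
```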

We have a service called persister that takes data from Redis and persists it to Elasticsearch in batch requests. Each persister run takes around 2-3 minutes and indexes 150k items in bulk requests of 2k items each. After the indexing phase it refreshes the indices one by one. Usually about 6 indices are affected.
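To make the flow concrete, here is a simplified sketch of what one persister run does. It is not our real code: the item shape, the type name `item`, and the helper names are made up for illustration, and it only builds the request bodies and paths rather than sending them.

```python
import json
from typing import Iterable, Iterator, List

BATCH_SIZE = 2000  # items per bulk request, as described above


def chunked(items: List[dict], size: int) -> Iterator[List[dict]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def bulk_body(batch: Iterable[dict]) -> str:
    """Build an NDJSON body for POST /_bulk: an action line plus a source line per item."""
    lines = []
    for item in batch:
        action = {"index": {"_index": item["index"], "_type": item["type"], "_id": item["id"]}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(item["doc"]))
    return "\n".join(lines) + "\n"  # the bulk API expects a trailing newline


def refresh_paths(indices: Iterable[str]) -> List[str]:
    """One POST /<index>/_refresh path per affected index, issued one by one."""
    return [f"/{name}/_refresh" for name in sorted(set(indices))]


# Hypothetical run: 150k items spread over two of the weekly indices.
index_names = ["xxx_2018-11-05", "yyy_2018-10-29"]
items = [
    {"index": index_names[i % 2], "type": "item", "id": str(i), "doc": {"n": i}}
    for i in range(150_000)
]

batches = list(chunked(items, BATCH_SIZE))
bodies = [bulk_body(batch) for batch in batches]          # indexing phase
paths = refresh_paths(item["index"] for item in items)    # refresh phase

print(len(batches), paths)
```

With 150k items and 2k per request this gives 75 bulk requests per run, followed by one refresh call per affected index.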

What I'm not happy about is that the refresh phase usually takes around 80% of the run time, and for the largest indices it usually times out after 60s.

What affects the refresh time most, and how could we improve it? Does our index/refresh flow make sense, or are there other best practices that we should follow?

More things that could be important:

Besides the persister there is one other service indexing data to Elasticsearch. It runs in 10 instances, and every 60s each instance indexes around 10k items to the same indices as the persister. It doesn't force any refresh.
Data is not distributed equally across shards (e.g. 23.8gb, 45.5gb, 30.6gb, 13.9gb, 12.1gb, 21.1gb). This is probably because we use routing for all items and some routing groups have much more data than others.
We create new indices every week and index new data into them.

index          pri rep docs.count store.size pri.store.size    tm  pri.tm
xxx_2018-10-15   5   2  514448277    691.7gb        230.5gb 3.5gb   1.1gb
xxx_2018-10-29   6   2  460991107    610.4gb        203.4gb 3.2gb     1gb
xxx_2018-10-08   5   1  547700252    490.3gb        245.1gb 2.7gb   1.3gb
yyy_2018-10-29   6   1  332578930    301.8gb        150.9gb 2.2gb     1gb
xxx_2018-11-05   6   2  316782562    424.2gb        144.4gb 2.1gb 725.6mb
xxx_2018-10-22   6   1  333790508    298.2gb        149.1gb 1.6gb 826.2mb

We use Elasticsearch 5.5 and have 4 types. The largest type has 45 properties.
We have one nested property and we use parent/child relations between the 4 types.
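To put a number on the shard imbalance mentioned above, here is a quick calculation using the example shard sizes from this post. It is only arithmetic on those six values; the variable names are illustrative.

```python
# Example shard sizes in GB, copied from the post.
shard_sizes_gb = [23.8, 45.5, 30.6, 13.9, 12.1, 21.1]

total = sum(shard_sizes_gb)
mean = total / len(shard_sizes_gb)
largest = max(shard_sizes_gb)
smallest = min(shard_sizes_gb)

# Ratios well above 1 indicate that routing concentrates data on a few shards.
largest_vs_mean = largest / mean
largest_vs_smallest = largest / smallest

print(f"total={total:.1f}gb mean={mean:.1f}gb")
print(f"largest/mean={largest_vs_mean:.2f} largest/smallest={largest_vs_smallest:.2f}")
```

So the largest shard holds roughly 1.9 times the average and almost 4 times as much as the smallest one.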
