Elasticsearch real time write indexing performance

Vinod_Kandula · May 1, 2020, 7:19pm

Hi,
Please help me with ES write performance for real-time indexing, and with how soon a document becomes available for search. To measure this in my JMeter performance script, I apply a fixed delay of 300 ms or more after each document is indexed before searching for it, but the indexing is not scaling. How can I scale up to 2000+ docs per second?

Christian_Dahlqvist · May 1, 2020, 8:52pm

By default, indexed documents are made searchable by a refresh that runs every second. A refresh is an expensive operation, and how often it runs can be controlled through the index.refresh_interval setting.
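As a rough illustration of changing that setting, here is a minimal sketch that only builds the request for the update index settings API (PUT on the index's _settings endpoint). The index name my-index, the 5s value, and the helper name refresh_interval_request are assumptions made for the example; sending the request is left to whatever HTTP client you already use.

```python
import json


def refresh_interval_request(index_name, interval):
    """Build (method, path, body) for an update-index-settings request.

    `index_name` and `interval` are illustrative inputs, e.g. "my-index"
    and "5s". An interval of "-1" disables periodic refreshes.
    """
    path = f"/{index_name}/_settings"
    body = json.dumps({"index": {"refresh_interval": interval}})
    return "PUT", path, body


# Hypothetical index name; substitute your own.
method, path, body = refresh_interval_request("my-index", "5s")
print(method, path)
print(body)
```

A longer interval means fewer refreshes and less overhead, at the cost of documents taking longer to show up in search results.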

Vinod_Kandula · May 2, 2020, 1:55am

I do a refresh after indexing every document, but the problem is that the indexing doesn't scale.

Christian_Dahlqvist · May 2, 2020, 6:37am

Indexing individual documents rather than using bulk requests results in a lot of overhead and will lead to significantly worse indexing performance and throughput. Refreshes are even more expensive operations, so calling one for every document adds even more overhead. Indexing that way is expected to perform very badly and will not scale, as you are basically doing the opposite of these guidelines for optimizing indexing performance.

Indexing this way is likely to cause a lot of small disk I/O, so it may be useful to look at disk utilization, iowait and IOPS to see what that looks like.
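To make the bulk suggestion concrete, here is a minimal sketch that builds the newline-delimited JSON body the bulk API expects: one action line followed by one source line per document, with a trailing newline. The index name, the document fields, and the helper name build_bulk_body are assumptions for illustration only. The body would be sent as a POST to the _bulk endpoint with Content-Type application/x-ndjson.

```python
import json


def build_bulk_body(index_name, docs):
    """Build an NDJSON bulk body from (doc_id, source) pairs.

    Each document becomes two lines: an "index" action line and the
    document source. The bulk API requires the body to end in a newline.
    """
    lines = []
    for doc_id, source in docs:
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"


# Hypothetical index name and documents.
body = build_bulk_body(
    "events",
    [("1", {"user": "a", "value": 10}), ("2", {"user": "b", "value": 20})],
)
print(body, end="")
```

One request carrying hundreds or thousands of documents replaces the same number of individual index calls, which is where most of the overhead goes.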

Vinod_Kandula · May 2, 2020, 6:43pm

Thanks! But my use case is real-time indexing, i.e. as soon as an update happens in the system, it must be indexed and be available for search and aggregation operations. Could you please suggest something, or is ES not used for real-time indexing use cases?

Christian_Dahlqvist · May 2, 2020, 6:46pm

Elasticsearch is not optimised for that use case.

Vinod_Kandula · May 2, 2020, 7:02pm

Thanks again. Do the Elastic advocates say the same thing? @dadoonet @Christian_Dahlqvist, can you confirm this?

dadoonet · May 3, 2020, 4:09am

You can definitely trust what @Christian_Dahlqvist says. Elasticsearch is a Near Real Time Search engine.

Vinod_Kandula · May 3, 2020, 4:53am

Thanks! But with a one-second refresh interval, we are not able to get the maximum indexing throughput. Is there anything else we can do about this?

Christian_Dahlqvist · May 3, 2020, 5:19am

Follow the guidance around optimising indexing performance that I linked to. The change that often makes the biggest difference is using the bulk API rather than indexing individual documents. Making sure you have fast storage is also very important.
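One way to use the bulk API while updates still arrive one at a time is to buffer them briefly on the client and send each batch as a single bulk request. The sketch below is only an illustration of that idea, not an Elasticsearch feature: the MicroBatcher class, its thresholds (500 documents or 0.2 seconds), and the flush callback are all assumptions, and the callback is where a real bulk request would go.

```python
import time


class MicroBatcher:
    """Collect documents and hand them to `flush` in batches.

    A batch is flushed when it reaches `max_docs` documents or when the
    oldest buffered document has waited at least `max_wait_s` seconds.
    """

    def __init__(self, flush, max_docs=500, max_wait_s=0.2, clock=time.monotonic):
        self._flush = flush            # callable taking a list of documents
        self._max_docs = max_docs
        self._max_wait_s = max_wait_s
        self._clock = clock            # injectable so it can be tested
        self._buffer = []
        self._first_added_at = None

    def add(self, doc):
        if not self._buffer:
            self._first_added_at = self._clock()
        self._buffer.append(doc)
        self.flush_if_due()

    def flush_if_due(self):
        """Flush if the batch is full or has waited long enough."""
        if not self._buffer:
            return
        full = len(self._buffer) >= self._max_docs
        stale = self._clock() - self._first_added_at >= self._max_wait_s
        if full or stale:
            self.flush()

    def flush(self):
        """Send whatever is buffered, if anything."""
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self._flush(batch)


# Usage: the callback here only prints; a real one would send a bulk request.
batcher = MicroBatcher(lambda batch: print(f"sending {len(batch)} docs"), max_docs=3)
for i in range(7):
    batcher.add({"id": i})
batcher.flush()  # send the remainder
```

Note that the time limit is only checked when add or flush_if_due is called, so a quiet producer would need a periodic call to flush_if_due. The added wait is a trade-off: documents become searchable slightly later, in exchange for far fewer requests.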

