Refresh strategy for real-time

bviz · April 3, 2019, 1:58pm

Hello.

We use Elastic to index about 260M documents, about 150GB in size. We need the indexed data to be available in real time (we query the data right after we index it), so we call a refresh after indexing documents, which turned out, as expected, to be a performance problem for our cluster under high load.

We tried to reduce the need for real-time data in Elastic as much as we could, but we have hit some barriers that we couldn't get past without doing manual refreshes after indexing.

I am wondering if there are any best practices around real time and refreshes that we could implement to overcome the real-time limitation, and have the data available right after index time.

I've read about the wait_for refresh option, but I'm afraid that at high load it will create a lot of refresh listeners in Elastic and the queues will fill up.

Thanks in advance!

DavidTurner · April 3, 2019, 4:15pm

Do you mean you've tried ?refresh=wait_for and found it to cause problems? I ask because if you create too many refresh listeners then this triggers a refresh, which sounds like it might be what you are wanting.
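
For illustration, here is a minimal sketch of what an index request with ?refresh=wait_for could look like, as opposed to indexing and then calling a refresh explicitly. It only builds the HTTP request and does not send it. The host, index name, document ID, and the helper name build_index_request are made up for the example, and the /_doc/ path assumes a reasonably recent Elasticsearch version.

```python
import json
import urllib.request
from urllib.parse import quote, urlencode


def build_index_request(base_url, index, doc_id, document, refresh="wait_for"):
    """Build (but do not send) a PUT request that indexes one document.

    Hypothetical helper for illustration only. With refresh="wait_for",
    Elasticsearch holds the response until a refresh has made the
    document visible to search, instead of forcing a refresh right away.
    """
    if refresh not in ("true", "false", "wait_for"):
        raise ValueError("refresh must be 'true', 'false' or 'wait_for'")

    # Assumed URL layout: <base>/<index>/_doc/<id>?refresh=<value>
    url = "{}/{}/_doc/{}?{}".format(
        base_url.rstrip("/"),
        quote(index, safe=""),
        quote(doc_id, safe=""),
        urlencode({"refresh": refresh}),
    )
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    # Assumed local cluster and index name.
    req = build_index_request(
        "http://localhost:9200", "my-index", "42", {"user": "bviz"}
    )
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```

As far as I know, the number of waiting requests per shard is capped by the index.max_refresh_listeners index setting (1000 by default). Once that many listeners are queued, the next wait_for request forces a refresh rather than adding to the queue, so the listeners should not pile up without bound.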

system · May 1, 2019, 4:16pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
