Refresh strategy for real-time


We use Elasticsearch to index about 260M documents, about 150GB in size. We need the indexed data to be available in real time (we query the data right after we index it), so we call a refresh after indexing documents. As expected, this turned out to be a performance problem for our cluster under high load.

We tried to reduce our need for real-time data in Elasticsearch as much as we could, but we hit some barriers that we couldn't get past without doing manual refreshes after indexing.

I am wondering if there are any best practices around real time and refreshes that we could implement to overcome the real-time limitation, and have the data available right after index time.

I've read about the refresh=wait_for option, but I'm afraid that under high load it will create a lot of refresh listeners in Elasticsearch and the queues will fill up.

Thanks in advance!

Do you mean you've tried ?refresh=wait_for and found it to cause problems? I ask because if too many refresh listeners pile up, that itself triggers a refresh, which sounds like it might be what you want.
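
For reference, here is a rough sketch of what an index request with ?refresh=wait_for could look like from Python, using only the standard library. The host (localhost:9200), the index name "events", and both helper names are made up for illustration and are not from the original post. As far as I know, the listener limit is the index.max_refresh_listeners setting (default 1000 per shard); once it is reached, a wait_for request behaves like refresh=true and the response carries "forced_refresh": true, which the second helper checks for.

```python
import json
import urllib.parse
import urllib.request


def build_index_request(base_url, index, doc_id, doc, refresh="wait_for"):
    """Build (but do not send) a PUT /<index>/_doc/<id>?refresh=... request.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if refresh not in ("true", "false", "wait_for"):
        raise ValueError("refresh must be 'true', 'false' or 'wait_for'")
    # Percent-encode the path segments so unusual ids do not break the URL.
    path = "/{}/_doc/{}".format(
        urllib.parse.quote(index, safe=""),
        urllib.parse.quote(str(doc_id), safe=""),
    )
    query = urllib.parse.urlencode({"refresh": refresh})
    url = base_url.rstrip("/") + path + "?" + query
    return urllib.request.Request(
        url,
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def was_forced_refresh(response_body):
    """Return True if the parsed index response reports a forced refresh."""
    return bool(response_body.get("forced_refresh", False))


# Assumed local node and index name, for illustration only.
req = build_index_request("http://localhost:9200", "events", 42, {"user": "kim"})
```

To actually send it you would call urllib.request.urlopen(req) and parse the JSON body. Logging how often was_forced_refresh returns True under load would show whether the listener limit is really being hit before you decide against wait_for.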
