Impact of the default index refresh interval on bulk indexing

Hi Group,

My question concerns bulk indexing when index.refresh_interval is left at its default of 1s and RefreshPolicy is set to NONE.

Query: In my case, the save bulk request runs in batches of 1k records each. Suppose that 400 of the 1k records have been indexed within 1 second when the indices are refreshed. Will those first 400 records be available to a parallel search query, even though the bulk request has not yet completed?

Additional queries: What is the optimum setting for the index refresh interval? The default is 1s.
Is the default good to go, or does it become a point of contention later as the index grows?
What are the major factors in deciding the index refresh interval?
In which scenarios can the default refresh interval of 1s have an adverse impact?
Does the choice of 1s vs. 2s vs. any other interval depend on the number of shards or replicas? Please advise.

Yep they will.
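To illustrate why the answer is yes, here is a toy model in Python. It is only a sketch of the visibility behaviour described above, not Elasticsearch's actual implementation, and the names ToyShard and simulate are made up for the example. It assumes that a refresh exposes every document indexed so far, regardless of whether the bulk request that sent it has finished.

```python
from dataclasses import dataclass, field


@dataclass
class ToyShard:
    """Toy model of refresh visibility (hypothetical, not Elasticsearch code)."""
    buffered: list = field(default_factory=list)    # indexed, not yet searchable
    searchable: list = field(default_factory=list)  # visible to searches

    def index(self, doc_id):
        # With RefreshPolicy NONE, indexing a document does not force a refresh.
        self.buffered.append(doc_id)

    def refresh(self):
        # A refresh exposes everything indexed so far; it does not wait
        # for the enclosing bulk request to complete.
        self.searchable.extend(self.buffered)
        self.buffered.clear()

    def search_count(self):
        return len(self.searchable)


def simulate(batch_size=1000, indexed_before_refresh=400):
    shard = ToyShard()
    for doc_id in range(indexed_before_refresh):
        shard.index(doc_id)
    shard.refresh()  # the periodic refresh fires in the middle of the batch
    visible_mid_batch = shard.search_count()

    for doc_id in range(indexed_before_refresh, batch_size):
        shard.index(doc_id)
    visible_before_next_refresh = shard.search_count()

    shard.refresh()  # the next periodic refresh
    return visible_mid_batch, visible_before_next_refresh, shard.search_count()


if __name__ == "__main__":
    print(simulate())
```

In this model a search issued between the two refreshes sees the first 400 documents and none of the remaining 600, which matches the scenario in the question.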

Thanks for confirming.

Could you also share some insight on the additional queries I added to the post above?

It's not clear what you are asking for there.

Thank you for the quick response. I will reach out again if any additional clarification is needed.

What is the optimum setting for the index refresh interval? The default is 1s.
Does it become a point of contention later as the index grows?
What are the major factors in deciding the index refresh interval?
In which scenarios can the default refresh interval of 1s have an adverse impact?
Does the choice of 1s vs. 2s vs. any other interval depend on the number of shards or replicas? Please advise.

What are you trying to solve here?

Hi Warkolm,

We are building a search service on top of the search capability of ES. To make the service efficient, we are trying to gauge the pros and cons of the default index refresh interval, so that we avoid any performance impact when bulk indexing and searching in parallel in a clustered environment.

The recent query was meant to understand the main factors that determine when the default setting is appropriate and when it can have an adverse effect and should be raised to a higher number of seconds (this is not called out in the documentation).

Note: Our indices are going to hold hundreds of millions of documents, if not billions.

The core answer to all of this is that you need to test based on your expected use case and workloads and find the balance.

However, I don't think that most of what you are worrying about will have that much of an impact.
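If you do want to test different intervals against your own workload, the setting can be changed per index through the update index settings API (PUT <index>/_settings). The sketch below only builds the request with the Python standard library. The cluster URL, the index name "my-index" and the value "30s" are placeholders assumed for illustration, not recommendations.

```python
import json
import urllib.request


def refresh_interval_request(base_url, index, interval):
    """Build (but do not send) a PUT <index>/_settings request.

    base_url, index and interval are caller-supplied; the values used in
    the example below are placeholders, not tuned recommendations.
    """
    body = json.dumps({"index": {"refresh_interval": interval}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def send(request):
    # Sends the request and returns the parsed JSON response.
    # Requires a reachable cluster; authentication is not handled here.
    with urllib.request.urlopen(request) as resp:
        return json.loads(resp.read())


if __name__ == "__main__":
    req = refresh_interval_request("http://localhost:9200", "my-index", "30s")
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

A typical experiment would be to run the same bulk load and search mix with a few different intervals and compare indexing throughput and search latency, which is the kind of testing suggested above.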
