Overwriting a whole index without downtime best practices

krezno · July 10, 2023, 6:06pm

I have several batch pipelines that produce a new version of the result each time. The data can't have any downtime so I can't just delete the index before writing. The current solution is to create a new index with the run date each time and point an alias to the latest index after writing, letting ILM take care of the old indices.

There are two main problems here:

1. If the pipeline stops for long enough, there may be data downtime if ILM deletes the index before the next run. I am planning to solve this by programmatically deleting the previous index after updating the alias.
2. If the index under the alias changes mid-read (for example, when paginating the results), the data becomes inconsistent: the first query runs against the old index, and after the alias switch the "next" pages come from the new index, which has different data.

Is there a better way to do this? I couldn't think of a way to address the second problem that isn't prohibitively complicated to implement for the benefit it brings.
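
For reference, the alias cutover described above can be sketched as a single request body for the `_aliases` API, which applies all of its actions atomically so readers never see the alias pointing at nothing. This is only a minimal sketch: the alias name `results`, the date-suffixed index names, and the helper names are made up for illustration, and the body still has to be sent to `POST /_aliases` with whatever client the pipeline uses.

```python
import json
from datetime import date


def index_name_for_run(prefix, run_date):
    """Hypothetical naming scheme: one index per pipeline run, suffixed with the run date."""
    return f"{prefix}-{run_date:%Y.%m.%d}"


def build_alias_swap(alias, old_index, new_index):
    """Build the body for POST /_aliases that repoints `alias` in one atomic call."""
    return {
        "actions": [
            # Both actions are applied together, so there is no moment
            # at which the alias resolves to no index.
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


old_index = index_name_for_run("results", date(2023, 7, 9))
new_index = index_name_for_run("results", date(2023, 7, 10))
body = build_alias_swap("results", old_index, new_index)
print(json.dumps(body, indent=2))
```

Deleting the previous index would then be a separate step (`DELETE /<old index>`) that the pipeline runs only after the swap has been acknowledged, which ties cleanup to a successful run rather than to an ILM timer.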

eMitch · July 10, 2023, 9:15pm

Hey @krezno!

Your strategy is sound and a pretty common scenario. I agree with your assessment of the first problem - better to tie the deletion to your processes properly rather than let ILM deal with it, if the indices are that necessary.

As for the second problem, I also agree. There are ways to "solve" or at least mitigate the issue, but at the cost of complexity. It may be worth checking how many users are actually impacted in that scenario to help you decide whether the additional complexity is worthwhile.

Are you monitoring pagination calls and/or open search contexts at the time you do your alias cutovers?
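
One possible mitigation, sketched here as an assumption rather than something proposed in this thread, is to paginate against a point in time (PIT) instead of the alias directly. A PIT opened with `POST /<alias>/_pit?keep_alive=1m` should stay bound to the index the alias resolved to when it was opened, so later pages keep reading the old data even after a cutover. The helper name, page size, and `keep_alive` value below are illustrative only; the body is sent to `POST /_search` with no index in the path.

```python
def build_page_request(pit_id, page_size, search_after=None, keep_alive="1m"):
    """Build one page of a PIT-based search_after pagination request."""
    body = {
        "size": page_size,
        # The PIT id pins the search to the index that existed when it was opened.
        "pit": {"id": pit_id, "keep_alive": keep_alive},
        # _shard_doc gives a stable order within a PIT (assumes a version that supports it).
        "sort": [{"_shard_doc": "asc"}],
    }
    if search_after is not None:
        # Sort values of the last hit on the previous page.
        body["search_after"] = search_after
    return body


first_page = build_page_request("example-pit-id", 100)
next_page = build_page_request("example-pit-id", 100, search_after=[4294967298])
```

The trade-off is that the old index has to outlive any open PIT, so programmatic deletion right after the alias switch would need a grace period at least as long as the `keep_alive` in use.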

