Reindex approach with zero downtime and minimal delay

Hi Team,

I want to reindex multiple indices that contain a lot of data. To reduce CPU usage, I am using the Reindex API (POST _reindex) with the requests_per_second parameter. New data that arrives during the reindexing can be handled with an alias, but what about updates to existing documents? How can I transfer or reflect the updated data in the new index after the reindexing? Will Elasticsearch manage transferring the updated data to the new index?
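For reference, the throttled reindex I am running looks roughly like this (index names and the rate are just placeholders):

```
# Throttle the reindex and run it as a background task
POST _reindex?requests_per_second=500&wait_for_completion=false
{
  "source": { "index": "old-index" },
  "dest": { "index": "new-index" }
}
```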

One approach I considered is pausing ingestion during the reindexing and resuming it afterwards, but I don't know whether that would work.

Is there a better approach that avoids data loss, with zero downtime but minimal delay?

I would appreciate your advice on this matter.

No. The data that exists at the time the reindexing job is initiated is what will be reindexed.

Pausing updates during the reindexing and then switching to the new indices is the safest way, but it can result in potentially long downtime.

If you have (or add) a last-updated timestamp on the documents (e.g. through an ingest pipeline that sets it to the time the document actually reached Elasticsearch), you may be able to perform an initial reindexing of the bulk of the data. Once that is complete, you could stop ingestion and run a separate reindexing job, filtered on the update timestamp, to catch all updates before switching over to the new index, as sketched below. This would potentially reduce downtime, but it may not catch deletes, so it could lead to some inconsistencies. It might be worth testing, though.
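A rough sketch of that idea; the pipeline name, field name, index names and the timestamp in the catch-up query are all illustrative (the timestamp would be the time the initial reindex started):

```
# Ingest pipeline that stamps each document with the time it reached Elasticsearch
PUT _ingest/pipeline/add-last-updated
{
  "processors": [
    {
      "set": {
        "field": "last_updated",
        "value": "{{_ingest.timestamp}}"
      }
    }
  ]
}

# After the initial bulk reindex: stop ingestion, then copy only the documents
# updated since the initial reindex started (placeholder timestamp)
POST _reindex
{
  "source": {
    "index": "old-index",
    "query": {
      "range": { "last_updated": { "gte": "2024-01-01T00:00:00Z" } }
    }
  },
  "dest": { "index": "new-index" }
}
```

For the timestamp to be applied on every write, the pipeline would need to be attached to the write path, for example via the index's index.default_pipeline setting.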

I don't want to stop ingestion of data, as it is a never-ending process. Is there any other way?

I think you need to stop it at some point if you do not want to risk losing data, but you can limit the downtime by copying over data in advance so that only a limited amount of data needs to be reindexed during the window.
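Once the catch-up pass has run inside that window, the switch itself can be done atomically with an alias swap, along these lines (again with placeholder names):

```
# Atomically point the alias at the new index
POST _aliases
{
  "actions": [
    { "remove": { "index": "old-index", "alias": "my-data" } },
    { "add": { "index": "new-index", "alias": "my-data" } }
  ]
}
```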