I'm working on a project where we are considering two approaches for zero-downtime re-indexing:
1. Create a new index and begin re-indexing data into it. While re-indexing, new indexing requests are written to both the new index and the old index, and searches run against an alias pointing to the old index. Once re-indexing is complete, switch the search alias to point to the new index.
2. Create a new index and a temporary write index, and begin re-indexing data into the former. While re-indexing, new indexing requests are written to the temporary write index, and searches run against an alias pointing to both the old index and the temporary write index. Once re-indexing is complete, writes start going to the new index, the temporary index is copied over, and the search alias is switched to point to the new index.
Approach #2 is clearly more complex. The only reason we're considering it is that we're not sure whether approach #1 has edge-case race conditions when we write to the new index while re-indexing is still running. With approach #1, the new index appears to be writable during re-indexing, but can a write conflict with the re-indexing process if it lands before re-indexing finishes? Specifically, could there be a problem when a document that already exists in the original index is updated while re-indexing is in progress?
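
To make approach #1 concrete, here is a rough sketch of the request bodies I have in mind, written as plain Python dicts so nothing is sent anywhere. The index and alias names (`items_v1`, `items_v2`, `items_search`) and the helper functions are made up for illustration. The sketch assumes the `_reindex` API with `op_type: create` and `conflicts: proceed`, so the copy would only create documents missing from the new index and skip ones a live write already put there; whether that is enough to rule out the race (deletes in particular may need separate handling) is part of what I'm asking.

```python
import json

# Hypothetical names, for illustration only.
OLD_INDEX = "items_v1"
NEW_INDEX = "items_v2"
SEARCH_ALIAS = "items_search"


def reindex_body(source, dest):
    """Body for POST /_reindex.

    Assumption: op_type "create" makes the copy skip documents that
    already exist in the destination (for example, ones written by live
    traffic), and conflicts "proceed" keeps the job running past those
    version conflicts instead of aborting.
    """
    return {
        "conflicts": "proceed",
        "source": {"index": source},
        "dest": {"index": dest, "op_type": "create"},
    }


def write_targets(reindex_in_progress):
    """Indices a live write goes to under approach #1."""
    if reindex_in_progress:
        return [OLD_INDEX, NEW_INDEX]  # dual-write while the copy runs
    return [NEW_INDEX]


def alias_swap_body(alias, old, new):
    """Body for POST /_aliases: remove and add in a single request."""
    return {
        "actions": [
            {"remove": {"index": old, "alias": alias}},
            {"add": {"index": new, "alias": alias}},
        ]
    }


if __name__ == "__main__":
    print(json.dumps(reindex_body(OLD_INDEX, NEW_INDEX), indent=2))
    print(write_targets(reindex_in_progress=True))
    print(json.dumps(alias_swap_body(SEARCH_ALIAS, OLD_INDEX, NEW_INDEX), indent=2))
```

Both alias actions go in one `_aliases` request so that searches never see the alias pointing at neither index.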