[Reindexing] Is there any drawback to writing to a new index while re-indexing?


(Chris Gatihi) #1

I'm working on a project where we are considering two approaches for zero-downtime re-indexing:

  1. Create a new index and begin re-indexing data into it. While re-indexing, new indexing requests are written to both the new index and the old index, and searches run against an alias pointing to the old index. Once re-indexing is complete, switch the search alias to point to the new index.

  2. Create a new index and a temporary write index, and begin re-indexing data into the former. While re-indexing, new indexing requests are written to the temporary write index, and searches run against an alias pointing to both the old index and the temporary write index. Once re-indexing is complete, writes go to the new index, the temporary index is copied over, and the search alias is switched to point to the new index.
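
For concreteness, the final alias switch in either approach could be done as a single atomic call to the `_aliases` API. This is only a rough sketch that builds the request body; the alias and index names (`search`, `items_v1`, `items_v2`) and the helper name are made up for illustration:

```python
import json


def build_alias_swap(alias, old_index, new_index):
    """Build a body for POST /_aliases that moves `alias` from old to new.

    Both actions go in one request, so searches never see the alias
    pointing at zero indices or at both indices.
    """
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


# Hypothetical names, for illustration only.
swap_body = build_alias_swap("search", "items_v1", "items_v2")
print(json.dumps(swap_body, indent=2))
```

The body would be sent with whatever client is in use once the re-index has finished.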

Approach #2 definitely seems more complex. The only reason we're considering it is that we're not sure whether there are edge-case race conditions if we write to the new index while re-indexing (approach #1). With approach #1, it looks like the new index is writable while re-indexing is happening, but is there any potential for conflict if we write to it before the re-indexing finishes? Specifically, could there be problems when a search document that already exists in the original index is updated while re-indexing is happening?
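
To make the update case concrete: one option that appears relevant for approach #1 is to run `_reindex` with `op_type: create` on the destination and `conflicts: proceed`, so that the copy only creates documents that are missing and skips any document the live write path has already put into the new index. A sketch of that request body, assuming hypothetical index names and a hypothetical helper name:

```python
import json


def build_reindex_body(old_index, new_index):
    """Build a body for POST /_reindex that never overwrites existing docs.

    "op_type": "create" makes the copy fail per-document when the ID
    already exists in the destination; "conflicts": "proceed" counts
    those version conflicts instead of aborting the whole re-index.
    """
    return {
        "conflicts": "proceed",
        "source": {"index": old_index},
        "dest": {"index": new_index, "op_type": "create"},
    }


# Hypothetical names, for illustration only.
reindex_body = build_reindex_body("items_v1", "items_v2")
print(json.dumps(reindex_body, indent=2))
```

This assumes live writes send the full document. It would not by itself cover deletes made during the re-index, or partial updates to a document that has not been copied yet.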


(James Addison) #2

One thing to consider is the possibility that your mapping may radically change between the old index and the new one. You might add a new field, modify an existing field, or remove a field. Application code that relies on your indexes storing documents in a specific way (either the new or the old index, depending on your approach) could be 'broken' while you wait for the alias switch.

Much of this depends on how your application logic changes - how you approach this problem one time might not fit the next time.

One way to mitigate some of the effects mentioned is to run parallel application stacks during the reindex - a heavy-handed approach!

In the projects where I handle this, I (currently) accept the fact that nothing's perfect, and sometimes I may have downtime, depending on the level of application logic change. Most of the time, it's zero downtime though.

