I have two clients/apps using one Elasticsearch instance, and I want them to run an async reindex task on an index in a coordinated fashion so that they don't redo each other's work (i.e., avoid both clients triggering a reindex at the same time, or one triggering it after the other has already completed it). I could coordinate between the clients directly, but I'm trying to brainstorm ways to do it without explicit coordination/synchronization: node A performs the reindex, and node B does nothing because it can detect that a reindex is already in progress or has already completed. Naively, the client/app that does nothing can simply check the Tasks API to see whether a reindex is in progress, and that works fine.
However, node A may finish reindexing before node B checks the Tasks API, in which case there won't be a reindex task to find, because completed tasks are not returned by the Tasks API. How will node B then know that the reindex has already happened? And if no reindex is in progress and none has happened yet, the roles are swapped: node B ought to be the one to trigger the reindex and node A should wait.
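For reference, here is a rough sketch of the Tasks API check described above. It assumes the response comes from `GET /_tasks?detailed=true&actions=*reindex` and that reindex task descriptions look like `reindex from [source] to [dest]`; that description format is an assumption worth verifying against your Elasticsearch version. The function names are my own.

```python
import re

# Assumption: with detailed=true, a reindex task's description looks like
# "reindex from [source] to [dest]". Verify this against your ES version.
_DESCRIPTION = re.compile(r"reindex from \[(?P<src>[^\]]*)\] to \[(?P<dest>[^\]]*)\]")


def running_reindexes(tasks_response):
    """Yield (task_id, source, dest) for each reindex task in a _tasks response."""
    for node in tasks_response.get("nodes", {}).values():
        for task_id, task in node.get("tasks", {}).items():
            # The reindex action is "indices:data/write/reindex".
            if not task.get("action", "").endswith("reindex"):
                continue
            match = _DESCRIPTION.search(task.get("description", ""))
            if match:
                yield task_id, match["src"], match["dest"]


def reindex_in_progress(tasks_response, source, dest):
    """Return True if a reindex from source to dest is currently listed."""
    return any(
        src == source and dst == dest
        for _, src, dst in running_reindexes(tasks_response)
    )
```

This only answers "is it running right now?". An empty result cannot distinguish "never started" from "already finished", which is exactly the gap I'm asking about.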
One idea is to keep a lock document in a dedicated index, say reindex-lock. When a client wants to reindex, it queries reindex-lock for an existing doc matching the source/destination index. If status == "in_progress", it does nothing. Otherwise, it inserts a doc with status = "in_progress", using op_type=create to ensure only one client wins.
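A minimal sketch of the lock acquisition, using only the Python standard library. The cluster URL, the `owner` field, and the `source:dest` document ID scheme are assumptions for illustration; `op_type=create` on the index API is what makes the second writer get a 409 conflict.

```python
import json
import urllib.error
import urllib.request
from urllib.parse import quote

ES_URL = "http://localhost:9200"  # assumption: local cluster without auth
LOCK_INDEX = "reindex-lock"


def http_send(method, path, body=None):
    """Send one JSON request to Elasticsearch; return (status, parsed_body)."""
    data = json.dumps(body).encode() if body is not None else None
    request = urllib.request.Request(
        ES_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, json.loads(response.read() or b"{}")
    except urllib.error.HTTPError as err:
        return err.code, json.loads(err.read() or b"{}")


def try_acquire(source, dest, owner, send=http_send):
    """Return True if this client created the lock doc, False if it already exists."""
    # Hypothetical ID scheme: index names cannot contain ":", so it is unambiguous.
    doc_id = quote(f"{source}:{dest}", safe="")
    path = f"/{LOCK_INDEX}/_doc/{doc_id}?op_type=create"
    status, _ = send("PUT", path, {"status": "in_progress", "owner": owner})
    if status == 201:  # created: this client won the race
        return True
    if status == 409:  # a lock doc already exists
        return False
    raise RuntimeError(f"unexpected status {status} while creating lock")
```

With a deterministic document ID, the create call by itself decides the winner, so the preliminary query is only needed to tell "in_progress" apart from "completed".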
Because I reindex on every Elasticsearch major version upgrade so that each index stays within two major versions of the server, this lock index will need to be reindexed too, and there may be a bootstrapping issue: to reindex the lock index with multiple clients, it would itself need a lock from a different index, reindex-lock-2, and so on.
This seems like a correct approach but in terms of implementation/maintenance I'll have to:
create the index
for every ES major version upgrade we'll have to reindex it
deal with failure conditions: ensure that status eventually becomes "completed" or that the lock is deleted; handle the case where the client/app creates the doc but crashes before triggering the reindex; and handle the opposite case, where the reindex is triggered but the client/app crashes before creating the doc
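To make the failure cases above concrete, here is one possible decision function combining the lock doc with the Tasks API result. It is a sketch, not a vetted protocol: the `started_at` field, the six-hour timeout, and the action names are all assumptions of mine.

```python
from datetime import datetime, timedelta, timezone


def next_action(lock_doc, task_running, now=None, ttl=timedelta(hours=6)):
    """Decide what a client should do next.

    lock_doc: the lock document's _source as a dict, or None if no lock exists.
    task_running: True if the Tasks API lists a matching reindex task.
    ttl: hypothetical age after which an idle in_progress lock counts as abandoned.
    """
    now = now or datetime.now(timezone.utc)
    if lock_doc is None:
        # A task without a lock means someone crashed before writing the doc.
        return "wait" if task_running else "acquire_and_reindex"
    if lock_doc["status"] == "completed":
        return "skip"
    if task_running:
        return "wait"
    # Lock says in_progress but no task is running: the holder may have crashed
    # before triggering the reindex, or it may just be about to start.
    started = datetime.fromisoformat(lock_doc["started_at"])  # hypothetical field
    return "take_over" if now - started > ttl else "wait"
```

A take-over would still need a conditional write (for example the index API's `if_seq_no` and `if_primary_term` parameters) so that only one client replaces the stale lock. Always creating the lock before triggering the reindex would also rule out the "reindex triggered but no doc" case.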