Zero Downtime Reindexing


(Andrew Kane) #1

Hi,

I've followed the documentation for zero-downtime mapping changes and it
works great.
http://www.elasticsearch.org/blog/changing-mapping-with-zero-downtime/

However, there is a (pretty big) race condition with this approach - while
reindexing, changes may not make it to the new index. I've looked all over
and haven't found a single solution to address this. The best attempt I've
seen is to buffer updates, but this is tedious and still leaves a race
condition (with a smaller window). My initial thoughts were to create a
write alias that points to the old and new indices and use versioning.
However, there is no way to write to multiple indices atomically.

It seems like this issue should affect most Elasticsearch users (whether
they realize it or not). Does anyone have a good solution to this?

Thanks,
Andrew

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/f633b7d6-67a6-464f-b0ad-fe478ae85cc6%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Jörg Prante) #2

Zero downtime works by using the atomic switch in the index alias setting.

Here is an example, which also lets you decommission nodes for maintenance.

  1. your index I1 with data is distributed on node (groups) N1 and N2
  2. create an alias A for I1
  3. direct your search API to alias A
  4. create a new index I2 only on N1. I2 may have other parameters than I1,
    the old index.
  5. you may decommission N2 now. All shards of I1 move to N1 automatically.
  6. direct your (re-)indexing API to I2, and index until I2 contains all
    the docs of I1
  7. switch alias A from I1 to I2. This is atomic.
  8. you may drop index I1 now
  9. maintain N2 (update software, hardware)
  10. let N2 rejoin and disable decommissioning of N2
  11. index I2 distributes shards over N1 and N2
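
The atomic switch in step 7 is a single request to the index aliases API that carries a remove action and an add action together. A minimal sketch of that request body in Python, using the names A, I1 and I2 from the list above; the helper name `build_alias_swap` is hypothetical, and sending the body to the cluster's `_aliases` endpoint is left to whatever HTTP client you use.

```python
import json


def build_alias_swap(alias, old_index, new_index):
    """Build the body for POST /_aliases that moves `alias` in one request."""
    return {
        "actions": [
            # Both actions travel in the same request, so searches never
            # see the alias pointing at neither index or at both.
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


body = build_alias_swap("A", "I1", "I2")
print(json.dumps(body, indent=2))
```

Note that this only makes the switch itself atomic; it says nothing about writes that reached I1 while I2 was being filled.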

Does this help?

Jörg



(Nik Everett) #3

Here is how I do it:

  1. Have index called foo_1392831890 with alias foo pointing to it
  2. Create index called foo_1392841890 with new config
  3. Scan/scroll everything from the foo alias into foo_1392841890.
  4. Swap alias. Time has now warped backwards.
  5. Run script to reindex everything that happened since I created
    foo_1392841890 from the system of record.
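
Step 5 can be sketched as a catch-up pass over the system of record. This is only an illustration in plain Python: the dictionaries stand in for the database and the new index, and the `updated_at` timestamp per record is an assumption about what the system of record stores. Deletes are not handled here and would need their own tracking.

```python
from datetime import datetime, timedelta


def catch_up(system_of_record, target_index, since):
    """Re-apply every record changed at or after `since` to the target index."""
    replayed = []
    for doc_id, (updated_at, body) in system_of_record.items():
        if updated_at >= since:
            target_index[doc_id] = body
            replayed.append(doc_id)
    return sorted(replayed)


# Moment the new index was created (hypothetical value).
created = datetime(2014, 2, 19, 20, 0)

records = {
    1: (created - timedelta(hours=1), "unchanged"),
    2: (created + timedelta(minutes=5), "edited during reindex"),
}

# State of the new index right after the alias swap: document 2 is stale.
new_index = {1: "unchanged", 2: "old body"}

replayed = catch_up(records, new_index, since=created)
print(replayed, new_index)
```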

If you happen to use jobs to update your index you can pause them during
this process which would prevent things from going back in time. They'd
just stall instead.

Another option is to index into both indexes once they exist. At this
point you'd have to do it manually. I imagine that'd actually be a nice
feature for Elasticsearch to add though.
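
A rough sketch of the manual dual-write option, combined with the versioning idea from the first post. The `VersionedIndex` class is an in-memory stand-in, not an Elasticsearch client; it mimics external versioning, where a write is rejected unless its version is higher than the stored one. The two writes are still not atomic, so the application would have to retry a write that fails on one side.

```python
class VersionedIndex:
    """In-memory stand-in for an index that uses external versioning."""

    def __init__(self):
        self.docs = {}  # doc_id -> (version, body)

    def index(self, doc_id, body, version):
        current = self.docs.get(doc_id)
        if current is not None and version <= current[0]:
            return False  # stale write is rejected
        self.docs[doc_id] = (version, body)
        return True


def dual_write(indexes, doc_id, body, version):
    """Send the same versioned write to every index; not atomic."""
    return [ix.index(doc_id, body, version) for ix in indexes]


old, new = VersionedIndex(), VersionedIndex()
old.index(1, "v1", version=1)

# A live update arrives during the reindex and goes to both indexes.
dual_write([old, new], 1, "v2", version=2)

# The bulk copy later replays the older copy of document 1 from its scroll.
accepted = new.index(1, "v1", version=1)
print(accepted, new.docs[1])
```

Because the stale copy carries a lower version, the new index keeps the live update regardless of the order in which the two writes arrive.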

Nik



(Matthias Richter) #4

Nik,

that is actually what Clinton Gormley told me on Twitter when I asked
the same question. I believe it is the way to go, but we still have that
small window between your steps 4 and 5. A request for a document on foo
may return no results if that document was indexed while step 3 was
running. Do you agree?

Matthias



(Andrew Kane) #5

I tried to post a reply yesterday but it looks like it never made it.

Thank you all for the quick replies. Here's a slightly better explanation
of where I believe the race condition occurs.

When the scan/scroll starts, the alias is still pointing to the old index,
so updates go to the old index. Let's say you update Document 1. If the
scroll/scan has already passed Document 1, the new index never sees the
update. Nik, the three solutions you mentioned are to:

  1. Keep track of updates manually [tedious]
  2. Pause the jobs that perform the updates [out of sync]
  3. Send updates to both indexes [also tedious]

However, none of these seem ideal.
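
The race described above can be reproduced with a small in-memory simulation. Everything here is a stand-in: the dictionaries play the role of the two indexes, and `concurrent_update` is a hypothetical hook that fires a write in the middle of the copy.

```python
old_index = {1: "v1", 2: "v1", 3: "v1"}
new_index = {}


def scan_and_copy(source, target, on_step=None):
    """Copy a snapshot of `source` into `target`, one document at a time."""
    snapshot = sorted(source.items())  # the scroll sees data as of its start
    for doc_id, body in snapshot:
        target[doc_id] = body
        if on_step:
            on_step(doc_id)


def concurrent_update(doc_id):
    # While the scan is at document 2, a writer updates document 1.
    # The alias still points at the old index, so the write lands there.
    if doc_id == 2:
        old_index[1] = "v2"


scan_and_copy(old_index, new_index, on_step=concurrent_update)

# Documents whose copy in the new index no longer matches the old index.
stale = {k for k in old_index if new_index.get(k) != old_index[k]}
print(stale)
```

Document 1 ends up stale in the new index because the scan had already passed it when the update arrived.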

Andrew



(Matthias Richter) #6

I'm still leaning towards manual updates or updating both indexes. Pausing
the jobs is acceptable, but then zero downtime clearly means read-only
while the reindex runs.

Is there a clean solution, or is something planned here for upcoming
releases?

Matthias


--
Matthias Richter
Pregelstraße 14
53127 Bonn

0228 33 600 997
0171 4724384



(José de Zárate) #7

How about, while the scan is being done, letting updates go to the old index
but with an extra field? Once the alias points to the new index, it's just
a query to fetch the documents with that extra field from the old index and
then reindex them into the new one. If the alias change or the new index
creation is unsuccessful, update the old index to remove that extra field.
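
A minimal sketch of this marker-field idea in plain Python. The dictionaries stand in for the two indexes, and the field name `changed_during_reindex` is made up for the example; against a real cluster the replay would be a query on that field followed by bulk indexing.

```python
MARKER = "changed_during_reindex"  # hypothetical field name


def write_during_reindex(old_index, doc_id, body):
    """Apply an update to the old index and tag it for later replay."""
    old_index[doc_id] = {**body, MARKER: True}


def replay_marked(old_index, new_index):
    """Copy every tagged document into the new index, without the tag."""
    replayed = []
    for doc_id, doc in old_index.items():
        if doc.get(MARKER):
            new_index[doc_id] = {k: v for k, v in doc.items() if k != MARKER}
            replayed.append(doc_id)
    return sorted(replayed)


old_index = {1: {"title": "a"}, 2: {"title": "b"}}
new_index = {k: dict(v) for k, v in old_index.items()}  # result of the scan

# An update arrives after the scan has already copied document 1.
write_during_reindex(old_index, 1, {"title": "a2"})

# After the alias swap, replay only the tagged documents.
replayed = replay_marked(old_index, new_index)
print(replayed, new_index)
```

Writes that arrive between the alias swap and the replay would still need care, since the replay could overwrite a newer copy in the new index.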



(Jim Abramson) #8

Agree with the original poster that none of the existing solutions are
ideal. Making it simpler and safer to roll out revised mappings would be a
huge win if your use case involves incremental revisions/refinements to
your indexing strategies. A lossless solution would especially benefit the
case where ES is being used as the primary data source (an option we have
been considering), since you really don't want to drop a record in that
case.



(Rob Ottaway) #9

Short of using a river to feed both indexes the same stream of updates,
I doubt that you will find any solution. The good news is that while it's
tedious, as you pointed out, once set up it flows very smoothly. We use
RabbitMQ in our case.
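
The idea of feeding both indexes from one stream can be sketched with a toy fan-out in plain Python. The `Fanout` class is a hypothetical stand-in for a message broker's fanout exchange, not RabbitMQ's API, and the dictionaries stand in for the indexes.

```python
from queue import Queue


class Fanout:
    """Toy fanout exchange: every bound queue receives every message."""

    def __init__(self):
        self.queues = []

    def bind(self):
        q = Queue()
        self.queues.append(q)
        return q

    def publish(self, message):
        for q in self.queues:
            q.put(message)


def drain(queue, index):
    """Apply all pending (doc_id, body) messages to an index, in order."""
    while not queue.empty():
        doc_id, body = queue.get()
        index[doc_id] = body


exchange = Fanout()
old_q, new_q = exchange.bind(), exchange.bind()
old_index, new_index = {}, {}

exchange.publish((1, "v1"))
exchange.publish((1, "v2"))

# Each index has its own consumer, so they can drain at different speeds.
drain(old_q, old_index)
drain(new_q, new_index)
print(old_index, new_index)
```

Since each index consumes its own copy of the stream in order, both converge on the same final state even if one consumer lags behind.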

A possible future feature would be an option at index creation that lets
the new index shadow the writes to one or more other indexes, without
duplicating them client side or using a river. When the shadowed index
goes away, so could the shadowing, or an API call could delete the shadow.

cheers,
Rob


