Best option for bulk refresh: alias blue/green vs single-index + version filter

alper · October 6, 2025, 11:39am

Hi everyone,

We develop an e-commerce search on Elasticsearch 8.x (Java client). Three indices: product, category, brand. ~50k products/day arrive via API (multiple times/day).

We want to add or update docs and remove products that are missing from the new dataset. If a field disappears in the source, it should be removed in ES. We need zero downtime and consistent reads during the refresh. We can add Redis if a version flag helps.

We have found two approaches:

1. Blue/Green with alias: Create a new index (same mappings/settings), bulk index the full new dataset, then atomically swap the read alias; drop the old index afterward.
   Questions: Pitfalls with alias swap and long-running PIT/scrolls? Mapping changes between versions?
2. Single index + version field: Ingest the new dataset with a higher version (stored in each doc; the app reads the current_version from Redis and adds a filter); later delete-by-query older versions.
   Questions: Prefer productId#version as _id to keep both versions, or overwrite the same _id and rely on full upsert? Any gotchas with DBQ at this scale?
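For reference, the moving parts of the single-index option could look roughly like the sketch below. It only builds the JSON bodies for a `_search` and a `_delete_by_query` request, so it runs without a cluster. The field names `version` and `name`, the class name, and the sample values are assumptions for illustration, not something stated in this thread.

```java
public class VersionFilterSketch {

    // _id that lets the old and new version of a product coexist in one index.
    static String docId(String productId, long version) {
        return productId + "#" + version;
    }

    // Search body: the user's text query plus a non-scoring filter on the
    // current version (which the app would read from Redis).
    // Note: `text` is not JSON-escaped here; real code should use the
    // client's query builders instead of string concatenation.
    static String searchBody(String text, long currentVersion) {
        return "{\"query\":{\"bool\":{"
             + "\"must\":[{\"match\":{\"name\":\"" + text + "\"}}],"
             + "\"filter\":[{\"term\":{\"version\":" + currentVersion + "}}]"
             + "}}}";
    }

    // Delete-by-query body: matches every doc older than the current version.
    static String cleanupBody(long currentVersion) {
        return "{\"query\":{\"range\":{\"version\":{\"lt\":" + currentVersion + "}}}}";
    }

    public static void main(String[] args) {
        System.out.println(docId("sku-123", 42));
        System.out.println(searchBody("red shoes", 42));
        System.out.println(cleanupBody(42));
    }
}
```

Every read has to carry the filter, and the cleanup has to run only after current_version has been bumped, which is the kind of bookkeeping this option depends on.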

Which approach would you pick and why for this volume and update pattern? Any bad stories or pitfalls to watch out for are very welcome.

Thank you.

stephenb · October 6, 2025, 2:51pm

Hi @alper

Big questions

As you seem to already be aware, it's all in the details...

I can say some of the biggest eComm customers I work with use some variety of your Option 1.

One customer does Blue / Green on the client side as well, to drain out / finish all the client requests on the "Old/Blue" index before the swap. I believe they actually use 2 aliases that are abstracted at the client level so they can run Blue / Green side by side as Blue drains out... thus 0 downtime, but it requires an additional layer of abstraction.

My other customer does not have long-running PITs etc... they just use the normal 1 level of abstraction, i.e. 1 alias, and they do the "Quick Cutover" ... pause the queue, let requests finish, then switch over and continue.
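The "Quick Cutover" idea (pause, let in-flight requests finish, switch, continue) can be sketched inside a single application instance with a read/write lock. This is only an illustration of the pattern, not how either customer implements it; the class and index names are made up, and a multi-instance deployment would need coordination outside the JVM.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Function;

public class CutoverGate {
    // Fair lock so a waiting cutover is not starved by a stream of new searches.
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);
    private String activeIndex; // guarded by lock

    public CutoverGate(String initialIndex) {
        this.activeIndex = initialIndex;
    }

    // Each search holds the read lock for its duration; many can run at once.
    public <T> T search(Function<String, T> query) {
        lock.readLock().lock();
        try {
            return query.apply(activeIndex);
        } finally {
            lock.readLock().unlock();
        }
    }

    // The write lock waits for in-flight searches to finish and holds back
    // new ones, runs the swap, then lets traffic continue on the new index.
    public void cutover(String newIndex, Runnable swapAlias) {
        lock.writeLock().lock();
        try {
            swapAlias.run();
            activeIndex = newIndex;
        } finally {
            lock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        CutoverGate gate = new CutoverGate("products_blue");
        System.out.println(gate.search(index -> "searching " + index));
        gate.cutover("products_green", () -> { /* alias swap request would go here */ });
        System.out.println(gate.search(index -> "searching " + index));
    }
}
```

Searches queue briefly while the swap runs, so the pause is only as long as the slowest in-flight request plus the alias update.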

This is not to say option 2 is not valid ... it seems valid, but it is a lot of "bookkeeping" that could get out of sync.

I have been using some version of 1 for 20 years with RDBMS; it is a pretty common approach.

Let us know where you end up.

alper · October 15, 2025, 4:20pm

Thank you. We decided to go with Option 1 (new index + alias swap).

Main reasons:

Lower query latency, because the version filter adds to search time.
With a single index, the BM25 term stats (df/ttf) shift as docs arrive during ingestion, which can change scores and hurt retrieval relevance. The blue/green swap avoids that: queries see either all-old or all-new data.
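To make the term-stats point concrete, here is a small sketch using the IDF formula from Lucene's BM25 similarity, ln(1 + (N - n + 0.5) / (n + 0.5)), where N is the number of docs and n is the number of docs containing the term. The doc counts are invented for illustration; they assume both versions coexist in one index mid-ingest and that the docs for this term happened to be re-ingested early.

```java
public class IdfDrift {

    // IDF as computed by Lucene's BM25 similarity.
    static double idf(long docCount, long docFreq) {
        return Math.log(1.0 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    public static void main(String[] args) {
        // Hypothetical numbers: 50k products, term appears in 500 of them.
        double before = idf(50_000, 500);
        // Mid-ingest: half the new dataset has arrived (75k docs in total),
        // and all 500 docs with this term already exist in both versions.
        double during = idf(75_000, 1_000);

        System.out.printf("before=%.3f during=%.3f%n", before, during);
    }
}
```

The version filter hides old docs from the results, but it does not remove them from the index-level statistics, so the term looks more common mid-ingest and its score contribution drops until the old version is cleaned up.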

Operationally we’ll bulk into the new index, refresh, then do an atomic alias cutover.
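The cutover step could be sketched as below: a single `POST /_aliases` request with a `remove` and an `add` action, which Elasticsearch applies atomically. The sketch only builds the request body so it can run without a cluster; the alias and index names are placeholders, and in practice the Java client's alias-update builders would replace the string concatenation.

```java
public class AliasSwap {

    // Body for POST /_aliases: move `alias` from oldIndex to newIndex.
    // Both actions are in one request, so readers never see the alias
    // missing or pointing at both indices.
    static String swapBody(String alias, String oldIndex, String newIndex) {
        return "{\"actions\":["
             + "{\"remove\":{\"index\":\"" + oldIndex + "\",\"alias\":\"" + alias + "\"}},"
             + "{\"add\":{\"index\":\"" + newIndex + "\",\"alias\":\"" + alias + "\"}}"
             + "]}";
    }

    public static void main(String[] args) {
        // Assumed sequence: bulk into products_v2, POST /products_v2/_refresh,
        // then send this body to POST /_aliases.
        System.out.println(swapBody("products", "products_v1", "products_v2"));
    }
}
```

The old index can then be dropped once nothing is reading from it directly.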

Appreciate the advice.
