Reindexing data in a cluster serving live traffic

Faiz_Ahmed_Mushtak_H · February 10, 2020, 1:16pm

Hi,

We currently have a cluster with 3 master nodes, client nodes, and 5 data nodes, which is serving live traffic.
We are planning to expand the cluster to 15 data nodes because we are migrating another feature to it (a fairly large one, with a lot of data to migrate).

At first, we thought of adding the 10 extra nodes to the cluster and simply ingesting the data for the new "feature" through the existing client nodes. But we felt that this might affect the existing features that ES is serving, because the data nodes and client nodes would be shared.

What we decided instead is to give the 10 new data nodes a node attribute, so that the new index is allocated only to those 10 nodes. The old features will keep being served from the original 5 nodes (we'll exclude the shards of those indices from being allocated to the new data nodes).
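A minimal sketch of what that split might look like, assuming the new nodes are tagged in `elasticsearch.yml` with a custom attribute such as `node.attr.pool: new_feature`. The attribute name `pool`, its value, and the index names are hypothetical; only the `index.routing.allocation.require.*` and `index.routing.allocation.exclude.*` setting prefixes come from Elasticsearch's shard allocation filtering. The helpers just build the bodies that would be sent with `PUT /<index>/_settings`:

```python
import json

# Hypothetical attribute name and value; the post does not name them.
ATTR = "pool"
NEW_POOL = "new_feature"


def require_new_nodes(attr=ATTR, value=NEW_POOL):
    """Settings body that restricts an index to nodes tagged attr=value."""
    return {f"index.routing.allocation.require.{attr}": value}


def exclude_new_nodes(attr=ATTR, value=NEW_POOL):
    """Settings body that keeps an index off nodes tagged attr=value."""
    return {f"index.routing.allocation.exclude.{attr}": value}


if __name__ == "__main__":
    # Index names below are placeholders for illustration only.
    print("PUT /new-feature-index/_settings")
    print(json.dumps(require_new_nodes(), indent=2))
    print("PUT /existing-index/_settings")
    print(json.dumps(exclude_new_nodes(), indent=2))
```

With `exclude`, the original 5 nodes need no attribute at all: the old indices simply avoid any node whose `pool` attribute matches the excluded value.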

For ingestion, we'll send requests directly to the new data nodes rather than going through the existing client nodes (or maybe add client nodes dedicated to this ingestion).
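As a rough sketch of the "hit the new data nodes directly" idea: the node that receives a request acts as its coordinating node, so a loader could spread `_bulk` requests across the new nodes only. The host names, index name, and batch size below are assumptions for illustration, and the code only plans the requests (URL plus NDJSON body) without sending anything:

```python
import json
from itertools import cycle

# Hypothetical addresses of the new data nodes.
NEW_DATA_NODES = ["http://new-data-1:9200", "http://new-data-2:9200"]


def bulk_body(index, docs):
    """Build an NDJSON body for the _bulk API: one action line per document."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


def batches(docs, size):
    """Yield consecutive slices of at most `size` documents."""
    for start in range(0, len(docs), size):
        yield docs[start:start + size]


def plan_requests(index, docs, hosts, size):
    """Pair each batch with a host, rotating through hosts round-robin."""
    host_cycle = cycle(hosts)
    return [
        (next(host_cycle) + "/_bulk", bulk_body(index, batch))
        for batch in batches(docs, size)
    ]


if __name__ == "__main__":
    sample = [{"id": i} for i in range(5)]
    for url, body in plan_requests("new-feature-index", sample, NEW_DATA_NODES, 2):
        print("POST", url, "with", body.count("\n") // 2, "documents")
```

Each planned body would be sent with `Content-Type: application/x-ndjson`. Whether this is enough to isolate the existing indices depends on the cluster, since master nodes and the network are still shared.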

Do you think this is fine, or is it over-engineering? For context, the new feature has almost 2 TB of data, so we're planning to ingest it as fast as we can.

