How to replace a node


How do I replace a node with the same IP?

Here is my situation:

I had a two-node cluster:

node-1 held the HOT data (it was acting as master before it went down) and node-2 held the WARM data. Because of a hardware issue, node-1 went down and node-2 became the master node.

Now the cluster shows only one data node.

 curl -s -k http://node-2:9200/_cluster/health?pretty
{
  "cluster_name" : "elk-test",
  "status" : "red",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 49,
  "active_shards" : 49,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 47,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 51.041666666666664
}

I am trying to add node-1 back into the cluster now that the hardware issue is fixed, but I am running into the error below, even after removing the data directory files.

[2019-05-21T09:50:42,648][INFO ][o.e.d.z.ZenDiscovery     ] [elk-1] failed to send join request to master [{elk-2}{XZlIzGZkSe2BE_4LKNaOcQ}{Wrz-4NzYTbGENWDJEJXDVA}{}{}], reason [RemoteTransportException[[elk-2][10.1.x.x.:9300][internal:discovery/zen/join]]; nested: IllegalArgumentException[can't add node {elk-1}{XZlIzGZkSe2BE_4LKNaOcQ}{X_T3K6qUTaebfWn6xwo9jg}{10.1.x.x1}{10.1.x.x1:9300}{box_type=hot}, found existing node {elk-2}{XZlIzGZkSe2BE_4LKNaOcQ}{Wrz-4NzYTbGENWDJEJXDVA}{10.1.x.x.}{10.1.x.x.:9300} with the same id but is a different node instance]; ]
[2019-05-21T09:50:45,742][INFO ][o.e.d.z.ZenDiscovery     ] [elk-1] failed to send join request to master [{elk-2}{XZlIzGZkSe2BE_4LKNaOcQ}{Wrz-4NzYTbGENWDJEJXDVA}{10.1.x.x.}{10.1.x.x.:9300}], reason [RemoteTransportException[[elk-2][10.1.x.x.:9300][internal:discovery/zen/join]]; nested: IllegalArgumentException[can't add node {elk-1}{XZlIzGZkSe2BE_4LKNaOcQ}{X_T3K6qUTaebfWn6xwo9jg}{10.1.x.x1}{10.1.x.x1:9300}{box_type=hot}, found existing node {elk-2}{XZlIzGZkSe2BE_4LKNaOcQ}{Wrz-4NzYTbGENWDJEJXDVA}{10.1.x.x.}{10.1.x.x.:9300} with the same id but is a different node instance]; ]
^C[2019-05-21T09:50:46,686][INFO ][o.e.n.Node               ] [elk-1] stopping ...
[2019-05-21T09:50:46,739][INFO ][o.e.n.Node               ] [elk-1] stopped
[2019-05-21T09:50:46,739][INFO ][o.e.n.Node               ] [elk-1] closing ...
[2019-05-21T09:50:46,751][INFO ][o.e.h.n.Netty4HttpServerTransport] [elk-1] publish_address {10.1.x.x1:9200}, bound_addresses {10.1.x.x1:9200}
[2019-05-21T09:50:46,751][INFO ][o.e.n.Node               ] [elk-1] started
[2019-05-21T09:50:46,797][INFO ][o.e.n.Node               ] [elk-1] closed
  1. Even after removing all the data files, why am I still seeing this error?
  2. Is there no option to replace a node instead of adding it as a fresh node, so that the data rebalancing overhead is reduced?


I recovered from that error. I have two data mounts, but I had deleted the files from only one data path. After I removed them from both places, the node was added to the cluster.
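
The fix above comes down to checking every entry in `path.data`, not just one. Here is a minimal sketch, assuming a 6.x-style layout in which each data path keeps its node state under a `nodes` subdirectory. The function name `leftover_node_state` and the example mount points are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print each data path that still contains a
# "nodes" subdirectory (the assumed location of persisted node state).
leftover_node_state() {
  for data_path in "$@"; do
    if [ -d "$data_path/nodes" ]; then
      printf '%s\n' "$data_path"
    fi
  done
}

# Example call with hypothetical mount points; pass the paths listed
# under path.data in elasticsearch.yml.
# leftover_node_state /data1/elasticsearch /data2/elasticsearch
```

Any path it prints still holds old node state and would probably need to be cleared (with the node stopped) before the node can join as a fresh instance.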

Can anyone help me with the 2nd question?


That depends on whether you are able to add the data from the removed node to the new one.
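
One way to see whether the data actually came back with the node is to list the shards that are still unassigned once it has rejoined. Here is a rough sketch, assuming the `_cat/shards` API is queried with the columns `index,shard,prirep,state`. The function name `unassigned_shards` is made up for illustration, and `node-2:9200` is the host from the question.

```shell
#!/bin/sh
# Hypothetical filter: keep only the rows whose 4th column (state)
# is UNASSIGNED.
unassigned_shards() {
  awk '$4 == "UNASSIGNED"'
}

# Example (needs a reachable cluster):
# curl -s 'http://node-2:9200/_cat/shards?h=index,shard,prirep,state' | unassigned_shards
```

If nothing is printed after the node has rejoined, every shard found a home. Any rows that remain are shards whose data did not make it back.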
