Remote reindex No route to host error despite remote cluster returning _cluster/health info

fidsamuraiCrelio · August 1, 2024, 7:07am

Hi All!

We use Elasticsearch 6.8.0 as a search engine for our application logs.
Unfortunately our cluster crashed and some indices were unrecoverable.
Since the cluster was in red state we spun up a new cluster so that logs could start again.

We've set up the new cluster with the reindex.remote.whitelist setting.
Initially we were able to migrate around 4-5k indices from the old cluster.

However, we need to migrate a few more, and we've now started getting this error:
No route to host

Full error -

{
  "error": {
    "root_cause": [
      {
        "type": "no_route_to_host_exception",
        "reason": "No route to host"
      }
    ],
    "type": "no_route_to_host_exception",
    "reason": "No route to host"
  },
  "status": 500
}

Reindex request -

POST _reindex?wait_for_completion=true&refresh
{
  "source": {
    "remote": {
      "host": "http://old-cluster:80",
      "socket_timeout": "1m",
      "connect_timeout": "1m"
    },
    "index": "index"
  },
  "dest": {
    "index": "index"
  }
}

These are all the logs I could find -

[2024-08-01T05:40:18,597][WARN ][r.suppressed             ] [titanNew-01] path: /_reindex, params: {wait_for_completion=true}
java.net.NoRouteToHostException: No route to host
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:?]
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716) ~[?:?]
        at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvent(DefaultConnectingIOReactor.java:171) ~[?:?]
        at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvents(DefaultConnectingIOReactor.java:145) ~[?:?]
        at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.execute(AbstractMultiworkerIOReactor.java:348) ~[?:?]
        at org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.execute(PoolingNHttpClientConnectionManager.java:192) ~[?:?]
        at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase$1.run(CloseableHttpAsyncClientBase.java:64) ~[?:?]
        at java.lang.Thread.run(Thread.java:750) [?:1.8.0_412]

Someone please help with this.

DavidTurner · August 1, 2024, 7:56am

I'm pretty sure that No route to host is an error message coming directly from the OS, so you likely need to involve your local network experts. Can't be certain without looking at the code tho, and 6.8.0 is far too old for that. Does it reproduce on a version that isn't EOL?
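One way to check whether the error really comes from the OS is to attempt a plain TCP connect, outside Elasticsearch, to every address the remote hostname resolves to. The following is a minimal sketch, assuming Python 3 is available on the nodes of the new cluster; `probe_host` is a hypothetical helper name, not part of any Elastic tooling. A Java NoRouteToHostException typically corresponds to the errno EHOSTUNREACH, so seeing that name for any resolved address would point at routing or firewall rules rather than Elasticsearch itself.

```python
import errno
import socket


def probe_host(host, port, timeout=5.0):
    """Try a TCP connect to every address `host` resolves to.

    Returns a list of (ip, result) tuples, where result is "ok",
    "timeout", or the symbolic errno name reported by the OS
    (for example "EHOSTUNREACH", i.e. "No route to host").
    """
    results = []
    seen = set()
    for family, socktype, proto, _canon, sockaddr in socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    ):
        ip = sockaddr[0]
        if ip in seen:
            continue  # skip duplicate addresses
        seen.add(ip)
        with socket.socket(family, socktype, proto) as sock:
            sock.settimeout(timeout)
            try:
                sock.connect(sockaddr)
                results.append((ip, "ok"))
            except socket.timeout:
                results.append((ip, "timeout"))
            except OSError as exc:
                # Map the numeric errno to its symbolic name when possible.
                results.append((ip, errno.errorcode.get(exc.errno, str(exc))))
    return results


# Example call, using the host and port from the reindex request:
#   for ip, result in probe_host("old-cluster", 80):
#       print(ip, result)
```

As far as I understand, the connection for a remote reindex is opened by the Elasticsearch node that handles the request, so it makes sense to run this on those nodes, and with exactly the host and port given in the `remote.host` field of the reindex body.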

fidsamuraiCrelio · August 1, 2024, 8:42am

Hi David,

The weird thing is that if I curl the old cluster it's successful.
This is from a VM in the new Cluster -

root@ip-192-168-31-250:~# curl titan-old.search.com/_cluster/health?pretty
{
  "cluster_name" : "Titan",
  "status" : "red",
  "timed_out" : false,
  "number_of_nodes" : 7,
  "number_of_data_nodes" : 6,
  "active_primary_shards" : 8474,
  "active_shards" : 8926,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 3879,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 69.70714564623194
}

Unfortunately we can't test this with a newer version since 7.x.x breaks a lot of things for us.

Let me know if you'd like me to test anything else.

Thanks!

DavidTurner · August 1, 2024, 9:03am

I have no other suggestions, sorry.

fidsamuraiCrelio · August 1, 2024, 9:20am

@DavidTurner thanks!

Is there anyone else you could refer me to for help with this?

DavidTurner · August 1, 2024, 9:33am

No, I don't think anyone else would look into issues with such an old version either.

fidsamuraiCrelio · August 2, 2024, 4:25am

Thanks anyway, @DavidTurner!

fidsamuraiCrelio · August 7, 2024, 4:11am

For anyone else struggling with something similar, a decent workaround is elasticdump.

It's a Node.js-based tool that migrates data and mappings as needed.

Not sure if this is approved by the Elastic Stack team, though.
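To illustrate the general idea behind this kind of client-side migration (read documents from the old cluster, write them to the new one through the bulk API), here is a minimal sketch in Python. It is not elasticdump's actual implementation; `hits_to_bulk_body` is a hypothetical helper, and it assumes the hits come from a search or scroll response on a 6.x cluster, where each hit carries `_type`, `_id` and `_source`.

```python
import json


def hits_to_bulk_body(hits, dest_index):
    """Convert search/scroll hits into an NDJSON body for the _bulk API.

    Each hit becomes two lines: an "index" action line and the document
    source. The _bulk API expects the body to end with a newline.
    """
    lines = []
    for hit in hits:
        action = {
            "index": {
                "_index": dest_index,
                "_type": hit["_type"],  # mapping types still exist in 6.x
                "_id": hit["_id"],
            }
        }
        lines.append(json.dumps(action))
        lines.append(json.dumps(hit["_source"]))
    return ("\n".join(lines) + "\n") if lines else ""
```

The resulting string would be sent to the new cluster's `_bulk` endpoint with the `Content-Type: application/x-ndjson` header. Fetching the hits page by page and copying the mappings are left out of this sketch.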
