Socket timeout when using reindex-from-remote

I'm using the new reindex API to move data from a 1.7 cluster to a 5.1 cluster, but I frequently get a SocketTimeoutException. When this happens, ES blacklists the host and the reindex task fails, because only one remote host can be specified when starting the reindex.

Is there anything I can do to prevent this from happening? My current workaround is to write a script that babysits the reindexing and restarts jobs/slices when they time out.

curl -XPOST 'localhost:9200/_reindex?slices=24' -d '{
    "source": {
        "index": "index_name",
        "remote": {
            "host": "http://remote-host:9200"
        }
    },
    "dest": {
        "index": "index_name"
    }
}'
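
For reference, the babysitting workaround mentioned above could be sketched roughly as follows. This is only a minimal sketch, not something taken from the Elasticsearch docs: `retry_until_ok` is a hypothetical helper name, and the usage in the trailing comment assumes that a failed reindex makes `curl --fail` exit non-zero, which I have not verified. As far as I know a restarted request begins a fresh reindex rather than resuming the old one.

```shell
# Hypothetical helper: rerun a command until it succeeds or attempts run out.
# Usage: retry_until_ok MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry_until_ok() {
    local max_attempts=$1
    local delay=$2
    shift 2
    local attempt=1
    while true; do
        # Run the wrapped command; stop as soon as it exits with status 0.
        if "$@"; then
            return 0
        fi
        # Give up once the attempt budget is used.
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts" >&2
            return 1
        fi
        echo "attempt $attempt failed, retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
    done
}

# Example (assumption: the request body is saved in a file named reindex.json):
#   retry_until_ok 10 60 curl --silent --show-error --fail \
#       -XPOST 'localhost:9200/_reindex?slices=24' -d @reindex.json
```

The fixed delay between attempts keeps the sketch short; a longer or growing delay might be kinder to a remote cluster that is already struggling.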

ES log:

[2016-12-27T18:40:50,114][DEBUG][o.e.c.RestClient         ] request [POST http://remote-host.internal:9200/_search/scroll?scroll=5m] failed
java.net.SocketTimeoutException: null
        at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.timeout(HttpAsyncRequestExecutor.java:375) [httpcore-nio-4.4.5.jar:4.4.5]
        at org.apache.http.impl.nio.client.InternalRequestExecutor.timeout(InternalRequestExecutor.java:116) [httpasyncclient-4.1.2.jar:4.1.2]
        at org.apache.http.impl.nio.client.InternalIODispatch.onTimeout(InternalIODispatch.java:92) [httpasyncclient-4.1.2.jar:4.1.2]
        at org.apache.http.impl.nio.client.InternalIODispatch.onTimeout(InternalIODispatch.java:39) [httpasyncclient-4.1.2.jar:4.1.2]
        at org.apache.http.impl.nio.reactor.AbstractIODispatch.timeout(AbstractIODispatch.java:175) [httpcore-nio-4.4.5.jar:4.4.5]
        at org.apache.http.impl.nio.reactor.BaseIOReactor.sessionTimedOut(BaseIOReactor.java:263) [httpcore-nio-4.4.5.jar:4.4.5]
        at org.apache.http.impl.nio.reactor.AbstractIOReactor.timeoutCheck(AbstractIOReactor.java:492) [httpcore-nio-4.4.5.jar:4.4.5]
        at org.apache.http.impl.nio.reactor.BaseIOReactor.validate(BaseIOReactor.java:213) [httpcore-nio-4.4.5.jar:4.4.5]
        at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:280) [httpcore-nio-4.4.5.jar:4.4.5]
        at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104) [httpcore-nio-4.4.5.jar:4.4.5]
        at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:588) [httpcore-nio-4.4.5.jar:4.4.5]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_92]
[2016-12-27T18:40:50,115][DEBUG][tracer                   ] curl -iX POST 'http://remote-host.internal:9200/_search/scroll?scroll=5m' -d 'c2NhbjsyNDsxNTU4MDU0OlJlQXZId0JWVExtUFV6NmxTVVJnU0E7MTU1ODA1NjpSZUF2SHdCVlRMbVBVejZsU1VSZ1NBOzE0MTI0NDUyNjpXQzBDWU5ESVN0S2F4MVJVUXhDTEZBOzEzNjU1NDYwODp2V1VaU2QtMVFLdS1aVHZ6NDFQN2NROzE1NTgwNTU6UmVBdkh3QlZUTG1QVXo2bFNVUmdTQTsxMzI5NTY1MDI6UVhYNUNPSWdSWFdxRE5VNkRBWDJsQTsxMjI4NjAwODA6Z1VldGgyMUZRMTJab05aVVQtQ3FWZzsxMzkxMTc0NzA6cWRYS0prZUlSSGVOMWZhVWJjZWFSQTsxMzE1NzczMDM6Rm9wVGloeGtSTVctWmEtWEw0T2d1ZzsxMzkzNDgyODE6U1JEc2FTZUlRZEtxSWN0eXRRczctdzsxNDI2Mzk4NTE6SDhtQVBKejRUY3FvdmRNQ0VmMndFdzsxMzY1NTQ2MDc6dldVWlNkLTFRS3UtWlR2ejQxUDdjUTsxNDQxODk4NzI6V29BcFFPdDlTcEc1SnhNVE91M2gyUTsxMzkxMTc0NzI6cWRYS0prZUlSSGVOMWZhVWJjZWFSQTsxMzkxMTc0NzE6cWRYS0prZUlSSGVOMWZhVWJjZWFSQTsxMzE1NzczMDQ6Rm9wVGloeGtSTVctWmEtWEw0T2d1ZzsxNDQxODk4NzQ6V29BcFFPdDlTcEc1SnhNVE91M2gyUTsxNDQxODk4NzM6V29BcFFPdDlTcEc1SnhNVE91M2gyUTsxMjA3MjkwMjE6MDBtemF5a2FTLTJCdUZWdGlad1NXUTsxMzI5NTY1MDQ6UVhYNUNPSWdSWFdxRE5VNkRBWDJsQTsxMjI4NjAwODE6Z1VldGgyMUZRMTJab05aVVQtQ3FWZzsxMzkzNDgyODI6U1JEc2FTZUlRZEtxSWN0eXRRczctdzsxMzQyMjI5NDM6Um83Nkk0aXJSUWlXcWx4MEpielZuZzsxMzI5NTY1MDM6UVhYNUNPSWdSWFdxRE5VNkRBWDJsQTsxO3RvdGFsX2hpdHM6MTIyNzQ4Njg0Ow=='
[2016-12-27T18:40:50,115][DEBUG][o.e.c.RestClient         ] added host [http://remote-host.internal:9200] to blacklist
[2016-12-27T18:40:50,115][DEBUG][o.a.h.i.n.c.PoolingNHttpClientConnectionManager] Connection manager is shutting down

Is there anything in the remote cluster's logs?
