I'm using the new reindex API to move from a 1.7 cluster to a 5.1 one, but I frequently get a SocketTimeoutException. When this happens, ES blacklists the host and the reindex task fails, since only one remote host can be specified when starting the reindex.
Is there anything I can do to prevent this from happening? My current solution is to write a script that babysits the reindexing and restarts jobs/slices when they time out.
curl -XPOST 'localhost:9200/_reindex?slices=24' -d '{
  "source": {
    "index": "index_name",
    "remote": {
      "host": "http://remote-host:9200"
    }
  },
  "dest": {
    "index": "index_name"
  }
}'
ES log:
[2016-12-27T18:40:50,114][DEBUG][o.e.c.RestClient ] request [POST http://remote-host.internal:9200/_search/scroll?scroll=5m] failed
java.net.SocketTimeoutException: null
at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.timeout(HttpAsyncRequestExecutor.java:375) [httpcore-nio-4.4.5.jar:4.4.5]
at org.apache.http.impl.nio.client.InternalRequestExecutor.timeout(InternalRequestExecutor.java:116) [httpasyncclient-4.1.2.jar:4.1.2]
at org.apache.http.impl.nio.client.InternalIODispatch.onTimeout(InternalIODispatch.java:92) [httpasyncclient-4.1.2.jar:4.1.2]
at org.apache.http.impl.nio.client.InternalIODispatch.onTimeout(InternalIODispatch.java:39) [httpasyncclient-4.1.2.jar:4.1.2]
at org.apache.http.impl.nio.reactor.AbstractIODispatch.timeout(AbstractIODispatch.java:175) [httpcore-nio-4.4.5.jar:4.4.5]
at org.apache.http.impl.nio.reactor.BaseIOReactor.sessionTimedOut(BaseIOReactor.java:263) [httpcore-nio-4.4.5.jar:4.4.5]
at org.apache.http.impl.nio.reactor.AbstractIOReactor.timeoutCheck(AbstractIOReactor.java:492) [httpcore-nio-4.4.5.jar:4.4.5]
at org.apache.http.impl.nio.reactor.BaseIOReactor.validate(BaseIOReactor.java:213) [httpcore-nio-4.4.5.jar:4.4.5]
at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:280) [httpcore-nio-4.4.5.jar:4.4.5]
at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104) [httpcore-nio-4.4.5.jar:4.4.5]
at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:588) [httpcore-nio-4.4.5.jar:4.4.5]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_92]
[2016-12-27T18:40:50,115][DEBUG][tracer ] curl -iX POST 'http://remote-host.internal:9200/_search/scroll?scroll=5m' -d 'c2NhbjsyNDsxNTU4MDU0OlJlQXZId0JWVExtUFV6NmxTVVJnU0E7MTU1ODA1NjpSZUF2SHdCVlRMbVBVejZsU1VSZ1NBOzE0MTI0NDUyNjpXQzBDWU5ESVN0S2F4MVJVUXhDTEZBOzEzNjU1NDYwODp2V1VaU2QtMVFLdS1aVHZ6NDFQN2NROzE1NTgwNTU6UmVBdkh3QlZUTG1QVXo2bFNVUmdTQTsxMzI5NTY1MDI6UVhYNUNPSWdSWFdxRE5VNkRBWDJsQTsxMjI4NjAwODA6Z1VldGgyMUZRMTJab05aVVQtQ3FWZzsxMzkxMTc0NzA6cWRYS0prZUlSSGVOMWZhVWJjZWFSQTsxMzE1NzczMDM6Rm9wVGloeGtSTVctWmEtWEw0T2d1ZzsxMzkzNDgyODE6U1JEc2FTZUlRZEtxSWN0eXRRczctdzsxNDI2Mzk4NTE6SDhtQVBKejRUY3FvdmRNQ0VmMndFdzsxMzY1NTQ2MDc6dldVWlNkLTFRS3UtWlR2ejQxUDdjUTsxNDQxODk4NzI6V29BcFFPdDlTcEc1SnhNVE91M2gyUTsxMzkxMTc0NzI6cWRYS0prZUlSSGVOMWZhVWJjZWFSQTsxMzkxMTc0NzE6cWRYS0prZUlSSGVOMWZhVWJjZWFSQTsxMzE1NzczMDQ6Rm9wVGloeGtSTVctWmEtWEw0T2d1ZzsxNDQxODk4NzQ6V29BcFFPdDlTcEc1SnhNVE91M2gyUTsxNDQxODk4NzM6V29BcFFPdDlTcEc1SnhNVE91M2gyUTsxMjA3MjkwMjE6MDBtemF5a2FTLTJCdUZWdGlad1NXUTsxMzI5NTY1MDQ6UVhYNUNPSWdSWFdxRE5VNkRBWDJsQTsxMjI4NjAwODE6Z1VldGgyMUZRMTJab05aVVQtQ3FWZzsxMzkzNDgyODI6U1JEc2FTZUlRZEtxSWN0eXRRczctdzsxMzQyMjI5NDM6Um83Nkk0aXJSUWlXcWx4MEpielZuZzsxMzI5NTY1MDM6UVhYNUNPSWdSWFdxRE5VNkRBWDJsQTsxO3RvdGFsX2hpdHM6MTIyNzQ4Njg0Ow=='
[2016-12-27T18:40:50,115][DEBUG][o.e.c.RestClient ] added host [http://remote-host.internal:9200] to blacklist
[2016-12-27T18:40:50,115][DEBUG][o.a.h.i.n.c.PoolingNHttpClientConnectionManager] Connection manager is shutting down
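For illustration, a babysitting script of the kind described above might look roughly like the following. This is a minimal sketch, not the script actually in use. The function names (`run_with_retries`, `start_reindex`), the retry count, and the delay are all hypothetical. It assumes the `_reindex` call blocks until the task finishes and that a failed run shows up either as an HTTP/socket error or as a non-empty `failures` array in the response. It restarts the whole request rather than individual slices.

```python
import json
import time
import urllib.request


def run_with_retries(start_job, max_attempts=5, delay_seconds=30.0, sleep=time.sleep):
    """Call start_job() until it returns without raising, or attempts run out."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return start_job()
        except (OSError, RuntimeError) as exc:
            # OSError covers socket timeouts and urllib's URLError/HTTPError.
            last_error = exc
            print(f"attempt {attempt}/{max_attempts} failed: {exc!r}")
            if attempt < max_attempts:
                sleep(delay_seconds)
    raise RuntimeError(f"gave up after {max_attempts} attempts") from last_error


def start_reindex(local="http://localhost:9200",
                  remote="http://remote-host:9200",
                  index="index_name",
                  slices=24):
    """Submit the same reindex request as the curl command and wait for it."""
    body = {
        "source": {"index": index, "remote": {"host": remote}},
        "dest": {"index": index},
    }
    request = urllib.request.Request(
        f"{local}/_reindex?slices={slices}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        result = json.load(response)
    # Assumption: a partially failed run reports its problems under "failures".
    if result.get("failures"):
        raise RuntimeError(f"reindex reported failures: {result['failures']}")
    return result


if __name__ == "__main__":
    print(run_with_retries(start_reindex))
```

Each restart begins the copy again from the start of the source index. If that re-copies too much, setting `"op_type": "create"` on `dest` together with `"conflicts": "proceed"` in the request body might let a restarted run skip documents that already exist in the destination; neither setting is part of the request shown above.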