Hi,
I am running Elasticsearch on Docker, and we use it along with Kibana as a reporting tool. We use the NEST API (.NET) to add and update documents directly on the indices.
When our service that performs CRUD operations on the indices is under high throughput, I see that some of the documents are not being indexed.
I have set up a three-node cluster on Docker, with all nodes master-eligible and minimum_master_nodes=2. Below are the errors I see in the containers that might be causing the missing documents.
The node is losing connectivity with the master (our Docker cluster network is unreliable right now). How do I increase the timeout so that a node has more time before it errors out due to connectivity?
[2019-04-24T21:52:57,020][WARN ][o.e.a.b.TransportShardBulkAction] [node2] [[enrollmentrow-201904][4]] failed to perform indices:data/write/bulk[s] on replica [enrollmentrow-201904][4], node[QtzbCdG9Q7qs8y9pN6W0BQ], [R], s[STARTED], a[id=TbrQRZuzQ0Kuq93akLUVVw]
org.elasticsearch.transport.NodeDisconnectedException: [node1][10.0.88.54:9300][indices:data/write/bulk[s][r]] disconnected
[2019-04-24T21:52:57,020][WARN ][o.e.a.b.TransportShardBulkAction] [node2] [[enrollmentrow-201904][4]] failed to perform indices:data/write/bulk[s] on replica [enrollmentrow-201904][4], node[QtzbCdG9Q7qs8y9pN6W0BQ], [R], s[STARTED], a[id=TbrQRZuzQ0Kuq93akLUVVw]
org.elasticsearch.transport.NodeDisconnectedException: [node1][10.0.88.54:9300][indices:data/write/bulk[s][r]] disconnected
[2019-04-24T21:52:57,020][WARN ][o.e.a.b.TransportShardBulkAction] [node2] [[enrollmentrow-201904][4]] failed to perform indices:data/write/bulk[s] on replica [enrollmentrow-201904][4], node[QtzbCdG9Q7qs8y9pN6W0BQ], [R], s[STARTED], a[id=TbrQRZuzQ0Kuq93akLUVVw]
org.elasticsearch.transport.NodeDisconnectedException: [node1][10.0.88.54:9300][indices:data/write/bulk[s][r]] disconnected
[2019-04-24T21:52:57,021][WARN ][o.e.a.b.TransportShardBulkAction] [node2] [[modelvalidationresult-201904][4]] failed to perform indices:data/write/bulk[s] on replica [modelvalidationresult-201904][4], node[QtzbCdG9Q7qs8y9pN6W0BQ], [R], s[STARTED], a[id=LKWgHc2DSfuDGgovGUt0MA]
org.elasticsearch.transport.NodeDisconnectedException: [node1][10.0.88.54:9300][indices:data/write/bulk[s][r]] disconnected
[2019-04-24T21:52:57,020][WARN ][o.e.t.OutboundHandler ] [node2] send message failed [channel: Netty4TcpChannel{localAddress=0.0.0.0/0.0.0.0:54360, remoteAddress=10.0.88.54/10.0.88.54:9300}]
java.io.IOException: No route to host
at sun.nio.ch.FileDispatcherImpl.writev0(Native Method) ~[?:?]
at sun.nio.ch.SocketDispatcher.writev(SocketDispatcher.java:51) ~[?:?]
at sun.nio.ch.IOUtil.write(IOUtil.java:182) ~[?:?]
at sun.nio.ch.IOUtil.write(IOUtil.java:130) ~[?:?]
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:496) ~[?:?]
at io.netty.channel.socket.nio.NioSocketChannel.doWrite(NioSocketChannel.java:420) ~[netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.AbstractChannel$AbstractUnsafe.flush0(AbstractChannel.java:938) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.forceFlush(AbstractNioChannel.java:367) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:650) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:556) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:510) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:470) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:909) [netty-common-4.1.32.Final.jar:4.1.32.Final]
at java.lang.Thread.run(Thread.java:835) [?:?]
[2019-04-24T21:52:57,021][WARN ][r.suppressed ] [node2] path: /modeltransformationresult-201904/dotomodeltransformationresult/33551ced-ac07-4d17-a44f-22e826d5ed25, params: {index=modeltransformationresult-201904, id=33551ced-ac07-4d17-a44f-22e826d5ed25, type=dotomodeltransformationresult}
org.elasticsearch.transport.NodeNotConnectedException: [node1][10.0.88.54:9300] Node not connected
at org.elasticsearch.transport.ConnectionManager.getConnection(ConnectionManager.java:151) ~[elasticsearch-6.7.1.jar:6.7.1]
at org.elasticsearch.transport.TransportService.getConnection(TransportService.java:556) ~[elasticsearch-6.7.1.jar:6.7.1]
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:528) [elasticsearch-6.7.1.jar:6.7.1]
at org.elasticsearch.action.support.replication.TransportReplicationAction$ReroutePhase.performAction(TransportReplicationAction.java:872) [elasticsearch-6.7.1.jar:6.7.1]
at org.elasticsearch.action.support.replication.TransportReplicationAction$ReroutePhase.performRemoteAction(TransportReplicationAction.java:846) [elasticsearch-6.7.1.jar:6.7.1]
at org.elasticsearch.action.support.replication.TransportReplicationAction$ReroutePhase.doRun(TransportReplicationAction.java:812) [elasticsearch-6.7.1.jar:6.7.1]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.7.1.jar:6.7.1]
[2019-04-24T21:52:57,022][WARN ][o.e.a.b.TransportShardBulkAction] [node2] [[enrollmentrow-201904][4]] failed to perform indices:data/write/bulk[s] on replica [enrollmentrow-201904][4], node[QtzbCdG9Q7qs8y9pN6W0BQ], [R], s[STARTED], a[id=TbrQRZuzQ0Kuq93akLUVVw]
org.elasticsearch.transport.NodeDisconnectedException: [node1][10.0.88.54:9300][indices:data/write/bulk[s][r]] disconnected
[2019-04-24T21:52:57,021][WARN ][o.e.d.z.UnicastZenPing ] [node2] failed to resolve host [[elasticsearch-1]
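
For reference, the timeout-related settings in 6.x zen discovery appear to be the fault-detection ones. Below is a minimal sketch of how they might be raised in elasticsearch.yml on each node, assuming they are the relevant knobs here. The values are illustrative and untested, and these settings only govern ping-based fault detection, so they may not help when the TCP connection itself is dropped (as in the "No route to host" error above).

```yaml
# elasticsearch.yml (6.x, zen discovery); values are illustrative assumptions
discovery.zen.minimum_master_nodes: 2   # as already configured in this cluster
discovery.zen.fd.ping_interval: 5s      # how often a node is pinged (default 1s)
discovery.zen.fd.ping_timeout: 60s      # how long to wait for a ping response (default 30s)
discovery.zen.fd.ping_retries: 6        # failed pings before the node is treated as gone (default 3)
```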