So we seem to be having recurring incidents of ES nodes getting into a very odd state. In this particular case, one node became unresponsive to our monitoring's test polls. I'm not really sure what to make of it: while this is going on the cluster still reports green, yet the borked node keeps trying to service traffic, so our app is sporadically failing in the meantime.
[UTC Feb 5 22:29:23] error : 'prod_elasticsearch_cluster_health' failed protocol test [HTTP] at INET[10.180.48.216:9200/_cluster/health] via TCP -- HTTP: Error receiving data -- Resource temporarily unavailable
[UTC Feb 5 22:30:28] error : 'prod_elasticsearch_cluster_health' failed protocol test [HTTP] at INET[10.180.48.216:9200/_cluster/health] via TCP -- HTTP: Error receiving data -- Resource temporarily unavailable
[UTC Feb 5 22:31:33] error : 'prod_elasticsearch_cluster_health' failed protocol test [HTTP] at INET[10.180.48.216:9200/_cluster/health] via TCP -- HTTP: Error receiving data -- Resource temporarily unavailable
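For context, the check that's failing there is just a dumb HTTP poll of /_cluster/health against that one node. A throwaway sketch along these lines (hypothetical host list, not our actual monit config) shows why the situation is confusing: polled directly, the wedged node times out while every other node happily reports green.

#!/usr/bin/env python3
# Rough sketch only: poll /_cluster/health on each node directly with a short
# timeout, so a single unresponsive node stands out even while cluster-level
# health stays green. Hostnames below are illustrative.
import json
import urllib.request

NODES = [
    "prod-es-r02.ihost.brewster.com",
    "prod-es-r04.ihost.brewster.com",
    "prod-es-r06.ihost.brewster.com",
    "prod-es-r07.ihost.brewster.com",
    "prod-es-r08.ihost.brewster.com",
]

for host in NODES:
    url = "http://%s:9200/_cluster/health" % host
    try:
        resp = urllib.request.urlopen(url, timeout=5)
        health = json.loads(resp.read().decode("utf-8"))
        print("%s: %s (%d nodes)" % (host, health["status"], health["number_of_nodes"]))
    except Exception as exc:  # timeouts / resets from the wedged node land here
        print("%s: UNRESPONSIVE (%s)" % (host, exc))

In this incident that's exactly the split: cluster-level status stayed green, and only the direct poll of prod-es-r07 (10.180.48.216) ever failed.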
Here are the corresponding logs from the node in question.
First, some of this:
[2012-02-05 22:32:11,552][INFO ][discovery.zen ] [prod-es-r07] master_left [[prod-es-r08][mlrGPzm3QeCm7d_E_Lvozg][inet[prod-es-r08.ihost.brewster.com/10.180.48.255:9300]]], reason [no longer master]
[2012-02-05 22:32:11,552][INFO ][cluster.service ] [prod-es-r07] master {new [prod-es-r04][uOUyy7p_TBuNEbwmqWF9-w][inet[prod-es-r04.ihost.brewster.com/10.180.35.110:9300]], previous [prod-es-r08][mlrGPzm3QeCm7d_E_Lvozg][inet[prod-es-r08.ihost.brewster.com/10.180.48.255:9300]]}, removed {[prod-es-r08][mlrGPzm3QeCm7d_E_Lvozg][inet[prod-es-r08.ihost.brewster.com/10.180.48.255:9300]],}, reason: zen-disco-master_failed ([prod-es-r08][mlrGPzm3QeCm7d_E_Lvozg][inet[prod-es-r08.ihost.brewster.com/10.180.48.255:9300]])
[2012-02-05 22:32:12,557][INFO ][discovery.zen ] [prod-es-r07] master_left [[prod-es-r04][uOUyy7p_TBuNEbwmqWF9-w][inet[prod-es-r04.ihost.brewster.com/10.180.35.110:9300]]], reason [no longer master]
[2012-02-05 22:32:12,558][WARN ][discovery.zen ] [prod-es-r07] not enough master nodes after master left (reason = no longer master), current nodes: {[prod-es-r02][uuh4KmeHR-eUeIr7J97zCg][inet[prod-es-r02.ihost.brewster.com/10.182.14.95:9300]],[prod-es-r07][zqJRs5e6S5eWfL0kVuolJg][inet[prod-es-r07.ihost.brewster.com/10.180.48.216:9300]],}
[2012-02-05 22:32:12,559][INFO ][cluster.service ] [prod-es-r07] removed {[prod-es-r02][uuh4KmeHR-eUeIr7J97zCg][inet[prod-es-r02.ihost.brewster.com/10.182.14.95:9300]],[prod-es-r04][uOUyy7p_TBuNEbwmqWF9-w][inet[prod-es-r04.ihost.brewster.com/10.180.35.110:9300]],}, reason: zen-disco-master_failed ([prod-es-r04][uOUyy7p_TBuNEbwmqWF9-w][inet[prod-es-r04.ihost.brewster.com/10.180.35.110:9300]])
[2012-02-05 22:32:12,565][WARN ][http.netty ] [prod-es-r07] Caught exception while handling client http traffic, closing connection [id: 0x09be0e53, /10.180.48.216:54645 => /10.180.48.216:9200]
java.lang.IllegalArgumentException: empty text
    at org.elasticsearch.common.netty.handler.codec.http.HttpVersion.<init>(HttpVersion.java:103)
    at org.elasticsearch.common.netty.handler.codec.http.HttpVersion.valueOf(HttpVersion.java:68)
    at org.elasticsearch.common.netty.handler.codec.http.HttpRequestDecoder.createMessage(HttpRequestDecoder.java:81)
    at org.elasticsearch.common.netty.handler.codec.http.HttpMessageDecoder.decode(HttpMessageDecoder.java:198)
    at org.elasticsearch.common.netty.handler.codec.http.HttpMessageDecoder.decode(HttpMessageDecoder.java:107)
    at org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:470)
    at org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:443)
    at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:80)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:783)
    at org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:81)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
    at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:274)
    at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:261)
    at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:351)
    at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.processSelectedKeys(NioWorker.java:282)
    at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:202)
    at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
    at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:44)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:636)
[2012-02-05 22:32:12,814][DEBUG][action.search.type ] [prod-es-r07] Node [Bbaoza_KTP2DJQgxM4JN-A] not available for scroll request [scan;1;5092799:Bbaoza_KTP2DJQgxM4JN-A;1;total_hits:7200;]
[2012-02-05 22:32:12,815][DEBUG][action.search.type ] [prod-es-r07] Node [Bbaoza_KTP2DJQgxM4JN-A] not available for scroll request [scan;1;5092799:Bbaoza_KTP2DJQgxM4JN-A;1;total_hits:7200;]
[2012-02-05 22:32:14,066][WARN ][http.netty ] [prod-es-r07] Caught exception while handling client http traffic, closing connection [id: 0x2cb594cf, /10.180.48.216:54651 => /10.180.48.216:9200]
Followed by tons of this:
[2012-02-05 22:32:24,572][DEBUG][action.search.type ] [prod-es-r07] [contact_documents-33-0][0], node[ar6qMqYnRSm5f0zvpKDirA], [R], s[STARTED]: Failed to execute [org.elasticsearch.action.search.SearchRequest@5c67fa3d]
org.elasticsearch.transport.SendRequestTransportException: [prod-es-r06][inet[prod-es-r06.ihost.brewster.com/10.180.46.203:9300]][search/phase/query]
    at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:196)
    at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:168)
    at org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:140)
    at org.elasticsearch.action.search.type.TransportSearchCountAction$AsyncAction.sendExecuteFirstPhase(TransportSearchCountAction.java:74)
    at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:205)
    at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.onFirstPhaseResult(TransportSearchTypeAction.java:279)
    at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$3.onFailure(TransportSearchTypeAction.java:211)
    at org.elasticsearch.search.action.SearchServiceTransportAction$2.handleException(SearchServiceTransportAction.java:151)
    at org.elasticsearch.transport.TransportService$2.run(TransportService.java:199)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:636)
Caused by: org.elasticsearch.transport.NodeNotConnectedException: [prod-es-r06][inet[prod-es-r06.ihost.brewster.com/10.180.46.203:9300]] Node not connected
    at org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:636)
    at org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:448)
    at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:181)
    ... 11 more
[2012-02-05 22:32:24,572][DEBUG][action.search.type ] [prod-es-r07] [contact_documents-4-0][0], node[ar6qMqYnRSm5f0zvpKDirA], [P], s[STARTED]: Failed to execute [org.elasticsearch.action.search.SearchRequest@718585ec]
org.elasticsearch.transport.SendRequestTransportException: [prod-es-r06][inet[prod-es-r06.ihost.brewster.com/10.180.46.203:9300]][search/phase/query]
    at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:196)
    at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:168)
    at org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:140)
    at org.elasticsearch.action.search.type.TransportSearchCountAction$AsyncAction.sendExecuteFirstPhase(TransportSearchCountAction.java:74)
    at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:205)
    at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.onFirstPhaseResult(TransportSearchTypeAction.java:279)
    at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$3.onFailure(TransportSearchTypeAction.java:211)
    at org.elasticsearch.search.action.SearchServiceTransportAction$2.handleException(SearchServiceTransportAction.java:151)
    at org.elasticsearch.transport.TransportService$2.run(TransportService.java:199)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:636)
Caused by: org.elasticsearch.transport.NodeNotConnectedException: [prod-es-r06][inet[prod-es-r06.ihost.brewster.com/10.180.46.203:9300]] Node not connected
    at org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:636)
    at org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:448)
    at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:181)
    ... 11 more
etc
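One more data point from the excerpt above: when prod-es-r07 logged "not enough master nodes after master left", it could only see itself and prod-es-r02, so presumably the discovery.zen.minimum_master_nodes quorum check kicked in and it refused to elect a master on its own. For reference, the usual majority formula for that setting looks like this (the node count here is a guess, not our actual topology):

# Majority quorum sketch for discovery.zen.minimum_master_nodes.
# master_eligible is an assumed count; substitute the real number of
# master-eligible nodes in the cluster.
master_eligible = 6
minimum_master_nodes = master_eligible // 2 + 1
print(minimum_master_nodes)  # -> 4 for a 6-node cluster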