Nodes restarting automatically


(Jorge Ferrando) #1

Hello

We have a cluster of 3 nodes running Ubuntu 12.04.4 LTS 64-bit and
elasticsearch v1.1.1.

It had been running flawlessly, but since last week some of the nodes
restart randomly. The cluster goes to red state, then yellow, then green,
and then it happens again in a loop (sometimes it doesn't even reach green state).

I've tried looking at the logs, but I can't find an obvious reason for what
might be going on.

I've found entries like these, but I don't know if they are in some way
related to the crash:

[2014-05-22 13:55:16,150][WARN ][index.codec ] [elastic ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field: [date_end] returning default postings format
[2014-05-22 13:55:16,151][WARN ][index.codec ] [elastic ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field: [date_end.raw] returning default postings format
[2014-05-22 13:55:16,151][WARN ][index.codec ] [elastic ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field: [date_start] returning default postings format
[2014-05-22 13:55:16,151][WARN ][index.codec ] [elastic ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field: [date_start.raw] returning default postings format

For instance, just now the cluster was in yellow state, really close to
reaching green, when node 3 suddenly restarted itself. Now the cluster is red
with 2000 shards initializing. The log on that node shows this:

[2014-05-22 13:59:48,498][INFO ][monitor.jvm ] [elastic ASIC nodo 3] [gc][young][1181][222] duration [735ms], collections [1]/[1s], total [735ms]/[1.1m], memory [6.5gb]->[6.1gb]/[19.9gb], all_pools {[young] [456mb]->[7.2mb]/[532.5mb]}{[survivor] [66.5mb]->[66.5mb]/[66.5mb]}{[old] [6gb]->[6gb]/[19.3gb]}
[2014-05-22 14:03:44,825][INFO ][node ] [elastic ASIC nodo 3] version[1.1.1], pid[7511], build[f1585f0/2014-04-16T14:27:12Z]
[2014-05-22 14:03:44,826][INFO ][node ] [elastic ASIC nodo 3] initializing ...
[2014-05-22 14:03:44,839][INFO ][plugins ] [elastic ASIC nodo 3] loaded [], sites [paramedic, inquisitor, HQ, bigdesk, head]
[2014-05-22 14:03:51,967][INFO ][node ] [elastic ASIC nodo 3] initialized
[2014-05-22 14:03:51,967][INFO ][node ] [elastic ASIC nodo 3] starting ...

The crash happened exactly at 14:02.
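
One way to trace restarts from the Elasticsearch log itself is to look for the startup banner (the `[node ]` entry containing `version[`) and report the timestamp of the last entry written before it. The following is a minimal Python sketch, not part of the original post. It assumes each log entry sits on a single line in the file on disk, and the name `find_restarts` is made up for illustration.

```python
import re
from datetime import datetime

# Leading timestamp of an Elasticsearch 1.x log entry,
# e.g. "[2014-05-22 13:59:48,498][INFO ][monitor.jvm ] ..."
TIMESTAMP = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}\]")

# The startup banner: the "node" logger printing "version[...]".
NODE_START = re.compile(r"\[node\s*\].*version\[")


def find_restarts(lines):
    """Return (last_entry_before_start, start_entry) timestamp pairs."""
    restarts = []
    previous = None
    for line in lines:
        match = TIMESTAMP.match(line)
        if not match:
            continue  # stack-trace or continuation line without a timestamp
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        if NODE_START.search(line) and previous is not None:
            restarts.append((previous, stamp))
        previous = stamp
    return restarts


# Entries taken from the log excerpt in this post (first one shortened).
sample = [
    "[2014-05-22 13:59:48,498][INFO ][monitor.jvm ] [elastic ASIC nodo 3] [gc][young][1181][222] duration [735ms]",
    "[2014-05-22 14:03:44,825][INFO ][node ] [elastic ASIC nodo 3] version[1.1.1], pid[7511], build[f1585f0/2014-04-16T14:27:12Z]",
    "[2014-05-22 14:03:44,826][INFO ][node ] [elastic ASIC nodo 3] initializing ...",
]

for before, after in find_restarts(sample):
    print(f"no log output between {before} and {after} ({after - before})")
```

For the excerpt in this post, the sketch reports a silent window from 13:59:48 to 14:03:44. That is consistent with the node going down around 14:02 without writing anything to its own log, which suggests looking outside Elasticsearch (for example in the system logs) for the cause.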

Any idea what could be going on, or how I can trace what's happening?

After rebooting there are also DEBUG errors like this:

[2014-05-22 14:06:16,621][DEBUG][action.search.type ] [elastic ASIC nodo 3] [logstash-2014.05.21][1], node[jgwbxcBoTVa3JIIG5a_FJA], [P], s[STARTED]: Failed to execute [org.elasticsearch.action.search.SearchRequest@42b80f4a] lastShard [true]
org.elasticsearch.transport.SendRequestTransportException: [elastic ASIC nodo 2][inet[/158.42.250.79:9301]][search/phase/query]
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:202)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:173)
at org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:208)
at org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.start(TransportSearchTypeAction.java:143)
at org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction.doExecute(TransportSearchQueryThenFetchAction.java:59)
at org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction.doExecute(TransportSearchQueryThenFetchAction.java:49)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
at org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:108)
at org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:43)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
at org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:92)
at org.elasticsearch.client.support.AbstractClient.search(AbstractClient.java:212)
at org.elasticsearch.rest.action.search.RestSearchAction.handleRequest(RestSearchAction.java:98)
at org.elasticsearch.rest.RestController.executeHandler(RestController.java:159)
at org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:142)
at org.elasticsearch.http.HttpServer.internalDispatchRequest(HttpServer.java:121)
at org.elasticsearch.http.HttpServer$Dispatcher.dispatchRequest(HttpServer.java:83)
at org.elasticsearch.http.netty.NettyHttpServerTransport.dispatchRequest(NettyHttpServerTransport.java:291)
at org.elasticsearch.http.netty.HttpRequestHandler.messageReceived(HttpRequestHandler.java:43)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:145)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)
at org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)
at org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:74)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: org.elasticsearch.transport.NodeNotConnectedException: [elastic ASIC nodo 2][inet[/158.42.250.79:9301]] Node not connected
at org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:859)
at org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:540)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:189)
... 50 more

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/fa53a41d-064b-4250-8003-31cf845b7216%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Mark Walkom) #2

How are you running the service: upstart, init, or something else?

ES shouldn't just restart on its own; this could be something else, like
the kernel's OOM killer.

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com



(Jorge Ferrando) #3

The elasticsearch nodes are launched through /etc/init.d/elasticsearch.



(Nik Everett) #4

Like Mark said, check the OOM killer. It should log to syslog. It is evil.

Nik
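
The syslog check suggested above could be sketched roughly as follows in Python. This is an illustration only, not something posted in the thread. It assumes the kernel messages land in `/var/log/syslog` (the usual Ubuntu location; `/var/log/kern.log` is another place to look) and matches loosely on phrases the kernel typically uses, since the exact wording varies between kernel versions. The helper names and the sample lines are hypothetical.

```python
import re

# Phrases the Linux kernel typically logs when the OOM killer fires.
# Assumption: wording differs between kernel versions, so match loosely.
OOM_PATTERN = re.compile(r"oom-killer|out of memory|killed process", re.IGNORECASE)


def oom_events(lines):
    """Return the lines that look like OOM-killer activity."""
    return [line.rstrip("\n") for line in lines if OOM_PATTERN.search(line)]


def scan_file(path="/var/log/syslog"):
    """Scan one log file; reading it may require root or adm group rights."""
    with open(path, errors="replace") as handle:
        return oom_events(handle)


# Made-up sample lines for demonstration; they are NOT from the poster's system.
sample = [
    "May 22 14:02:01 host kernel: java invoked oom-killer\n",
    "May 22 14:02:01 host kernel: Out of memory: Kill process 1234 (java)\n",
    "May 22 14:02:05 host CRON[811]: (root) CMD (true)\n",
]

for hit in oom_events(sample):
    print(hit)
```

To run it against a real node, call `scan_file()` (or `scan_file("/var/log/kern.log")`) and compare the timestamps of any hits with the time of the restart, 14:02 in the original report.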

On Thu, May 22, 2014 at 2:14 PM, Jorge Ferrando jorfermo@gmail.com wrote:

elasticsearch nodes are launched through /etc/init.d/elasticsearch

On Thu, May 22, 2014 at 2:13 PM, Mark Walkom markw@campaignmonitor.comwrote:

How are you running the service, upstart, init or something else?

ES shouldn't just restart on it's own, this could be something else like
the kernel's OOM killer.

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com



(Jorge Ferrando) #5

I've been checking syslog on all of the nodes and found no mention of
OOM, a killed process, out of memory, or anything similar.

Just in case, I ran these commands on the 3 nodes, and the problem persists:

echo "0" > /proc/sys/vm/oom-kill
echo 1 > /proc/sys/vm/overcommit_memory
echo 100 > /proc/sys/vm/overcommit_ratio
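
For anyone repeating the syslog check described above, here is a minimal sketch of how it could be automated. It is only an illustration: the function names are hypothetical, and the message patterns are an assumption based on what Linux kernels commonly log when the OOM killer acts (the exact wording differs between kernel versions).

```python
import re

# Phrases the Linux kernel commonly logs when the OOM killer acts.
# ASSUMPTION: wording varies between kernel versions, so extend this
# pattern if your kernel phrases it differently.
OOM_PATTERN = re.compile(
    r"invoked oom-killer|Out of memory: Kill(ed)? process|Killed process \d+",
    re.IGNORECASE,
)


def find_oom_lines(lines):
    """Return the log lines that look like OOM-killer activity."""
    return [line.rstrip("\n") for line in lines if OOM_PATTERN.search(line)]


def scan_file(path):
    """Scan one log file (hypothetical helper) and return matching lines."""
    with open(path, errors="replace") as handle:
        return find_oom_lines(handle)
```

On Ubuntu the file to scan is usually /var/log/syslog, so something like `scan_file("/var/log/syslog")` on each node would list any suspicious lines. An empty result supports the observation above that the OOM killer is probably not involved, although it does not rule out rotated or truncated logs.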

On Thu, May 22, 2014 at 2:16 PM, Nikolas Everett nik9000@gmail.com wrote:

Like Mark said, check the OOM killer. It should log to syslog. It is
evil.

Nik

On Thu, May 22, 2014 at 2:14 PM, Jorge Ferrando jorfermo@gmail.com wrote:

elasticsearch nodes are launched through /etc/init.d/elasticsearch

On Thu, May 22, 2014 at 2:13 PM, Mark Walkom markw@campaignmonitor.com wrote:

How are you running the service, upstart, init or something else?

ES shouldn't just restart on its own, this could be something else like
the kernel's OOM killer.

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com



(emeschitc) #6

Hi,

I may be wrong, but it seems to me you have a problem with your network. It
may be a flaky connection, a broken NIC, or something wrong with your
configuration for discovery and/or data transport?

Caused by: org.elasticsearch.transport.NodeNotConnectedException: [elastic ASIC nodo 2][inet[/158.42.250.79:9301]] Node not connected
at org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:859)
at org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:540)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:189)

Check the status of the network on this node.
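
As a rough first check of the network path, one could test from each node whether the transport port of the other nodes accepts TCP connections. The sketch below is only an illustration with a hypothetical helper name. It proves nothing about intermittent drops, so it would need to be run repeatedly (or around the time of a crash) to be useful.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For the address in the exception above, the call would be `can_connect("158.42.250.79", 9301)`. A False result from node 3 while node 2 is running would point at the network or at the transport configuration rather than at Elasticsearch itself.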



(Jorge Ferrando) #7

A different message in the log after another crash (the Spanish IOException text means "Connection reset by peer"):

[2014-05-23 14:17:11,580][WARN ][transport.netty ] [elastic ASIC nodo 3] exception caught on transport layer [[id: 0xc5d07c82, /158.42.250.192:59864 :> /158.42.250.79:9301]], closing connection
java.io.IOException: Conexión reinicializada por la máquina remota
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:192)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:64)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)

On Thu, May 22, 2014 at 2:34 PM, Jorge Ferrando jorfermo@gmail.com wrote:

I've been checking syslog on all of the nodes and found no mention of
OOM, a killed process, out of memory, or anything similar.

Just in case, I ran these commands on the 3 nodes, and the problem persists:

echo "0" > /proc/sys/vm/oom-kill
echo 1 > /proc/sys/vm/overcommit_memory
echo 100 > /proc/sys/vm/overcommit_ratio
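
To confirm what those writes actually changed, the values can be read back afterwards. Below is a minimal Python sketch, assuming a Linux /proc filesystem. The helper name read_vm_settings is made up for illustration, and the file names are simply the ones used in the commands above; a file that does not exist on a given kernel is reported as None rather than raising. As far as the kernel documentation goes, overcommit_ratio is only consulted when overcommit_memory is set to 2, so with a value of 1 it probably has no effect.

```python
import os


def read_vm_settings(names, base="/proc/sys/vm"):
    """Read the named sysctl files under `base`.

    Returns a dict mapping each name to its stripped content,
    or to None when the file is missing or unreadable.
    """
    settings = {}
    for name in names:
        path = os.path.join(base, name)
        try:
            with open(path) as handle:
                settings[name] = handle.read().strip()
        except OSError:
            # Missing file or insufficient permissions.
            settings[name] = None
    return settings
```

Calling read_vm_settings(["oom-kill", "overcommit_memory", "overcommit_ratio"]) on each node would show which of the three files exist there and what they currently hold.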

On Thu, May 22, 2014 at 2:16 PM, Nikolas Everett nik9000@gmail.com wrote:

Like Mark said, check the OOM killer. It should log to syslog. It is
evil.

Nik
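
A rough way to run that check is to scan the system logs for the phrases the kernel usually prints when the OOM killer fires. The Python sketch below is only an illustration: the function names are hypothetical, the exact kernel wording varies between kernel versions, and the usual Ubuntu log locations (/var/log/syslog, /var/log/kern.log) are an assumption.

```python
import re

# Phrases the Linux kernel commonly logs around an OOM kill.
# The exact wording differs between kernel versions (assumption).
OOM_PATTERN = re.compile(
    r"invoked oom-killer|Out of memory: Kill(ed)? process|Killed process \d+",
    re.IGNORECASE,
)


def find_oom_events(lines):
    """Return the lines that look like OOM-killer activity."""
    return [line.rstrip("\n") for line in lines if OOM_PATTERN.search(line)]


def scan_file(path):
    """Scan one log file, tolerating undecodable bytes."""
    with open(path, errors="replace") as handle:
        return find_oom_events(handle)
```

For example, scan_file("/var/log/syslog") returns an empty list when nothing matches. Rotated logs (syslog.1 and older) would need to be scanned as well to cover earlier crashes.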

On Thu, May 22, 2014 at 2:14 PM, Jorge Ferrando jorfermo@gmail.com wrote:

The elasticsearch nodes are launched through /etc/init.d/elasticsearch

On Thu, May 22, 2014 at 2:13 PM, Mark Walkom markw@campaignmonitor.com wrote:

How are you running the service: upstart, init, or something else?

ES shouldn't just restart on its own; this could be something else,
like the kernel's OOM killer.

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com



(Jorge Ferrando) #8

I thought about that, but it would be strange: these are 3 virtual
machines in the same VMware cluster along with hundreds of other services,
and nobody has reported any networking problems.

On Thu, May 22, 2014 at 3:16 PM, emeschitc emeschitc@gmail.com wrote:

Hi,

I may be wrong, but it seems to me you have a problem with your network. It
may be a flaky connection, a broken NIC, or something wrong with your
configuration for discovery and/or data transport.

Caused by: org.elasticsearch.transport.NodeNotConnectedException: [elastic ASIC nodo 2][inet[/158.42.250.79:9301]] Node not connected
at org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:859)
at org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:540)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:189)

Check the status of the network on this node.
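
One simple way to start that check is to test whether the transport port of the other node accepts TCP connections from the affected node. The Python sketch below is only an illustration; the function name can_connect is made up, and the address and port to probe are whatever appears in the exception (158.42.250.79 and 9301 in the trace above).

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Running can_connect("158.42.250.79", 9301) from node 3 shows whether the port is reachable at that moment. A single success does not prove the link was stable when the error was logged, so repeating the probe over time is more telling.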



(Jorge Ferrando) #9

I've been analyzing the problem with Marvel and Nagios, and I managed to get
2 more details:

  • The node that restarts/reinitializes is always the same one: node 3.
  • It always happens shortly after the cluster reaches green state,
    anywhere from a few seconds to 2-3 minutes later.
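
To pin down that timing more precisely, the cluster status can be polled and only the transitions recorded. The Python sketch below is a rough illustration: it uses the cluster health endpoint (GET /_cluster/health, whose JSON response carries a "status" field), while the function names and the http://localhost:9200 address are assumptions.

```python
import json
import urllib.request


def fetch_status(base_url="http://localhost:9200"):
    """Return the cluster status: "green", "yellow" or "red"."""
    url = base_url + "/_cluster/health"
    with urllib.request.urlopen(url, timeout=5) as response:
        return json.load(response)["status"]


def status_changes(samples):
    """Reduce time-ordered (timestamp, status) pairs to transitions.

    Each result is (timestamp, previous_status, new_status); the
    first sample is reported with previous_status None.
    """
    changes = []
    previous = None
    for timestamp, status in samples:
        if status != previous:
            changes.append((timestamp, previous, status))
            previous = status
    return changes
```

Calling fetch_status() every few seconds from a machine outside the cluster and feeding the results to status_changes gives exact times for each green-to-red flip, which can then be compared with the node 3 log and with syslog.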

I have debug mode on in logging.yml:

logger:
  # log action execution errors for easier debugging
  action: DEBUG

But I don't see anything in the log. For instance, the last time it
happened, the cluster became green at around 9:47 and the node restarted at 9:50:

[2014-05-29 09:30:57,235][INFO ][monitor.jvm ] [elastic ASIC nodo 3] [gc][young][129][20] duration [745ms], collections [1]/[1s], total [745ms]/[8.5s], memory [951.1mb]->[598.9mb]/[29.9gb], all_pools {[young] [421.5mb]->[8.2mb]/[532.5mb]}{[survivor] [66.5mb]->[66.5mb]/[66.5mb]}{[old] [463.1mb]->[524.1mb]/[29.3gb]}
[2014-05-29 09:45:36,322][WARN ][monitor.jvm ] [elastic ASIC nodo 3] [gc][old][964][1] duration [29.5s], collections [1]/[30.4s], total [29.5s]/[29.5s], memory [5.1gb]->[4.3gb]/[29.9gb], all_pools {[young] [29.4mb]->[34.9mb]/[532.5mb]}{[survivor] [59.9mb]->[0b]/[66.5mb]}{[old] [5gb]->[4.2gb]/[29.3gb]}
[2014-05-29 09:50:41,040][INFO ][node ] [elastic ASIC nodo 3] version[1.2.0], pid[7021], build[c82387f/2014-05-22T12:49:13Z]
[2014-05-29 09:50:41,041][INFO ][node ] [elastic ASIC nodo 3] initializing ...
[2014-05-29 09:50:41,063][INFO ][plugins ] [elastic ASIC nodo 3] loaded [marvel], sites [marvel, paramedic, inquisitor, HQ, bigdesk, head]
[2014-05-29 09:50:47,908][INFO ][node ] [elastic ASIC nodo 3] initialized
[2014-05-29 09:50:47,909][INFO ][node ] [elastic ASIC nodo 3] starting ...

Is there any other way of debugging what's going on with that node?
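
One thing that can be pulled out of the log itself is the list of long garbage-collection pauses, such as the 29.5s old-generation collection logged at 09:45:36 above. The Python sketch below is only an illustration: the regular expression is written against the monitor.jvm lines as they appear in this thread, and the function name and the 5-second threshold are assumptions. Whether such a pause is related to the restart is not established here; it just gives timestamps to correlate with the restarts.

```python
import re

# Matches monitor.jvm GC lines as quoted in this thread; the node
# name may wrap across lines, so it is matched loosely.
GC_LINE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}\]"
    r"\[\w+\s*\]\[monitor\.jvm\s*\]\s*\[[^\]]*\]\s*"
    r"\[gc\]\[(?P<gen>\w+)\]\[\d+\]\[\d+\]\s*"
    r"duration\s*\[(?P<value>[\d.]+)(?P<unit>ms|s|m|h)\]"
)
UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0}


def long_gc_pauses(log_text, threshold_seconds=5.0):
    """Return (timestamp, generation, seconds) for pauses at or above the threshold."""
    pauses = []
    for match in GC_LINE.finditer(log_text):
        seconds = float(match.group("value")) * UNIT_SECONDS[match.group("unit")]
        if seconds >= threshold_seconds:
            pauses.append((match.group("ts"), match.group("gen"), seconds))
    return pauses
```

Feeding it the whole node log, for example long_gc_pauses(open(path).read()) with path pointing at the node's log file, lists every pause over the threshold so the times can be lined up against each restart.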


On Thu, May 22, 2014 at 2:07 PM, Jorge Ferrando [via ElasticSearch Users] wrote:

Hello

We have a cluster of 3 nodes running Ubuntu 12.04.4 LTS 64-bit and
elasticsearch v1.1.1.

It had been running flawlessly, but since last week some of the nodes
restart randomly and the cluster goes to red state, then yellow, then green,
and then it happens again in a loop (sometimes it doesn't even reach green state).

I've tried to look at the logs, but I can't find any obvious reason for
what is going on.

I've found entries like these, but I don't know if they are in some way
related to the crash:

[2014-05-22 13:55:16,150][WARN ][index.codec ] [elastic
ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field:
[date_end] returning default postings format
[2014-05-22 13:55:16,151][WARN ][index.codec ] [elastic
ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field:
[date_end.raw] returning default postings format
[2014-05-22 13:55:16,151][WARN ][index.codec ] [elastic
ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field:
[date_start] returning default postings format
[2014-05-22 13:55:16,151][WARN ][index.codec ] [elastic
ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field:
[date_start.raw] returning default postings format

For instance, just now it was in yellow state, really close to reaching
green, when node 3 suddenly restarted itself; now the cluster is red
with 2000 shards initializing. The log on that node shows this:

[2014-05-22 13:59:48,498][INFO ][monitor.jvm ] [elastic
ASIC nodo 3] [gc][young][1181][222] duration [735ms], collections [1]/[1s],
total [735ms]/[1.1m], memory [6.5gb]->[6.1gb]/[19.9gb], all_pools {[young]
[456mb]->[7.2mb]/[532.5mb]}{[survivor] [66.5mb]->[66.5mb]/[66.5mb]}{[old]
[6gb]->[6gb]/[19.3gb]}
[2014-05-22 14:03:44,825][INFO ][node ] [elastic
ASIC nodo 3] version[1.1.1], pid[7511], build[f1585f0/2014-04-16T14:27:12Z]
[2014-05-22 14:03:44,826][INFO ][node ] [elastic
ASIC nodo 3] initializing ...
[2014-05-22 14:03:44,839][INFO ][plugins ] [elastic
ASIC nodo 3] loaded [], sites [paramedic, inquisitor, HQ, bigdesk, head]
[2014-05-22 14:03:51,967][INFO ][node ] [elastic
ASIC nodo 3] initialized
[2014-05-22 14:03:51,967][INFO ][node ] [elastic
ASIC nodo 3] starting ...

The crash happened exactly at 14:02.

Any idea what could be going on, or how I can trace what's happening?

After rebooting, there are also DEBUG errors like this:

[2014-05-22 14:06:16,621][DEBUG][action.search.type ] [elastic
ASIC nodo 3] [logstash-2014.05.21][1], node[jgwbxcBoTVa3JIIG5a_FJA], [P],
s[STARTED]: Failed to execute
[org.elasticsearch.action.search.SearchRequest@42b80f4a] lastShard
[true]
org.elasticsearch.transport.SendRequestTransportException: [elastic ASIC
nodo 2][inet[/158.42.250.79:9301]][search/phase/query]
at
org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:202)
at
org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:173)
at
org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:208)
at
org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
at
org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
at
org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
at
org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.start(TransportSearchTypeAction.java:143)
at
org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction.doExecute(TransportSearchQueryThenFetchAction.java:59)
at
org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction.doExecute(TransportSearchQueryThenFetchAction.java:49)
at
org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
at
org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:108)
at
org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:43)
at
org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
at org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:92)
at
org.elasticsearch.client.support.AbstractClient.search(AbstractClient.java:212)
at
org.elasticsearch.rest.action.search.RestSearchAction.handleRequest(RestSearchAction.java:98)
at
org.elasticsearch.rest.RestController.executeHandler(RestController.java:159)
at
org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:142)
at
org.elasticsearch.http.HttpServer.internalDispatchRequest(HttpServer.java:121)
at
org.elasticsearch.http.HttpServer$Dispatcher.dispatchRequest(HttpServer.java:83)
at
org.elasticsearch.http.netty.NettyHttpServerTransport.dispatchRequest(NettyHttpServerTransport.java:291)
at
org.elasticsearch.http.netty.HttpRequestHandler.messageReceived(HttpRequestHandler.java:43)
at
org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at
org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at
org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at
org.elasticsearch.common.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:145)
at
org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at
org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at
org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at
org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at
org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)
at
org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)
at
org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)
at
org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at
org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at
org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at
org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:74)
at
org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at
org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at
org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at
org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at
org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at
org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
at
org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
at
org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at
org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at
org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at
org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: org.elasticsearch.transport.NodeNotConnectedException:
[elastic ASIC nodo 2][inet[/158.42.250.79:9301]] Node not connected
at
org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:859)
at
org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:540)
at
org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:189)
... 50 more



(David Pilato) #10

I think GC took too long, so your node became unresponsive.
If you give it 30 GB of RAM, you should increase the ping timeout, i.e. how long the cluster waits before a node is marked as unresponsive.
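
One way to check the GC theory is to pull the long pauses out of the node's log. The following is a minimal sketch, assuming the log file contains one-line `monitor.jvm` entries in the format posted in this thread; the function name `long_gc_pauses` and the 5-second threshold are illustrative choices, not part of Elasticsearch.

```python
import re

# Matches one-line monitor.jvm entries in the format posted in this thread, e.g.
# [2014-05-29 09:45:36,322][WARN ][monitor.jvm ] [...] [gc][old][964][1] duration [29.5s], ...
GC_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\[\w+\s*\]\[monitor\.jvm\s*\].*?"
    r"\[gc\]\[(?P<gen>\w+)\]\[\d+\]\[\d+\] duration \[(?P<value>[\d.]+)(?P<unit>ms|s|m)\]"
)

# Conversion factors for the duration suffixes seen in these logs.
UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "m": 60.0}


def long_gc_pauses(lines, threshold_s=5.0):
    """Return (timestamp, generation, seconds) for GC pauses >= threshold_s.

    threshold_s is a hypothetical cut-off chosen for illustration.
    """
    pauses = []
    for line in lines:
        match = GC_LINE.search(line)
        if match is None:
            continue  # not a GC line
        seconds = float(match.group("value")) * UNIT_SECONDS[match.group("unit")]
        if seconds >= threshold_s:
            pauses.append((match.group("ts"), match.group("gen"), seconds))
    return pauses


if __name__ == "__main__":
    # Two entries copied (shortened) from the log posted in this thread.
    sample = [
        "[2014-05-29 09:30:57,235][INFO ][monitor.jvm ] [elastic ASIC nodo 3] "
        "[gc][young][129][20] duration [745ms], collections [1]/[1s]",
        "[2014-05-29 09:45:36,322][WARN ][monitor.jvm ] [elastic ASIC nodo 3] "
        "[gc][old][964][1] duration [29.5s], collections [1]/[30.4s]",
    ]
    for ts, gen, seconds in long_gc_pauses(sample):
        print(ts, gen, seconds)
```

Entries that the mail archive wrapped across several lines will not match; run it against the original log file instead.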

And if you are under memory pressure, you could review your requests to see whether they can be optimized, or add new nodes.

My 2 cents.

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs



(Jorge Ferrando) #11

Thanks for the answer, David.

I added these settings to elasticsearch.yml a few days ago to see if that
was the problem:

discovery.zen.ping.timeout: 5s
discovery.zen.fd.ping_interval: 5s
discovery.zen.fd.ping_timeout: 60s
discovery.zen.fd.ping_retries: 3

If I'm not mistaken, with those settings the node should be marked as
unavailable after 3 minutes (3 retries of 60s each), and most of the time it
happens sooner. Am I wrong?
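
The arithmetic behind that estimate can be sketched as follows. This is a rough model, not Elasticsearch's actual fault-detection code: it assumes every retry waits the full `ping_timeout` and ignores the `ping_interval` between attempts. The helper names `parse_duration` and `worst_case_detection_seconds` are made up for the example.

```python
import re

# The fault-detection settings quoted above, as they appear in elasticsearch.yml.
SETTINGS = {
    "discovery.zen.fd.ping_interval": "5s",
    "discovery.zen.fd.ping_timeout": "60s",
    "discovery.zen.fd.ping_retries": "3",
}

UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "m": 60.0}


def parse_duration(text):
    """Convert a value such as '60s' or '5m' to seconds."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)(ms|s|m)", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return float(match.group(1)) * UNIT_SECONDS[match.group(2)]


def worst_case_detection_seconds(settings):
    """Assumed upper bound: every retry waits for the full ping timeout."""
    timeout = parse_duration(settings["discovery.zen.fd.ping_timeout"])
    retries = int(settings["discovery.zen.fd.ping_retries"])
    return timeout * retries


if __name__ == "__main__":
    total = worst_case_detection_seconds(SETTINGS)
    print(f"node marked unavailable after about {total:.0f}s ({total / 60:.0f} min)")
```

Under this model the 29.5s old-generation pause reported earlier in the thread is shorter than a single 60s ping timeout, so on its own it would not be expected to get the node dropped.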

On Thu, May 29, 2014 at 10:29 AM, David Pilato david@pilato.fr wrote:

GC took too much time so your node become unresponsive I think.
If you set 30 Gb RAM, you should increase the time out ping setting before
a node is marked as unresponsive.

And if you are under memory pressure, you could try to check your requests
and see if you can have some optimization or start new nodes...

My 2 cents.

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

Le 29 mai 2014 à 09:56, Jorge Ferrando jorfermo@gmail.com a écrit :

I've been analyzing the problem with Marvel and nagios and I managed to
get 2 more details:

  • The node restarting/reinitializing it's always the same. Node 3
  • It always happens quickly after getting the cluster in green state.
    Between some seconds and 2-3 minutes

I have debug mode on in logging.yml:

logger:

log action execution errors for easier debugging

action: DEBUG

But i dont see anything in the log. For instance, this is the last time it
happened at around 9:47 the cluster became green and 9:50 the node restarted

[2014-05-29 09:30:57,235][INFO ][monitor.jvm ] [elastic ASIC
nodo 3] [gc][young][129][20] duration [745ms], collections [1]/[1s], total
[745ms]/[8.5s], memory [951.1mb]->[598.9mb]/[29.9gb], all_pools {[young]
[421.5mb]->[8.2mb]/[532.5mb]}{[survivor] [66.5mb]->[66.5mb]/[66.5mb]}{[old]
[463.1mb]->[524.1mb]/[29.3gb]}
[2014-05-29 09:45:36,322][WARN ][monitor.jvm ] [elastic ASIC
nodo 3] [gc][old][964][1] duration [29.5s], collections [1]/[30.4s], total
[29.5s]/[29.5s], memory [5.1gb]->[4.3gb]/[29.9gb], all_pools {[young]
[29.4mb]->[34.9mb]/[532.5mb]}{[survivor] [59.9mb]->[0b]/[66.5mb]}{[old]
[5gb]->[4.2gb]/[29.3gb]}
[2014-05-29 09:50:41,040][INFO ][node ] [elastic ASIC
nodo 3] version[1.2.0], pid[7021], build[c82387f/2014-05-22T12:49:13Z]
[2014-05-29 09:50:41,041][INFO ][node ] [elastic ASIC
nodo 3] initializing ...
[2014-05-29 09:50:41,063][INFO ][plugins ] [elastic ASIC
nodo 3] loaded [marvel], sites [marvel, paramedic, inquisitor, HQ, bigdesk,
head]
[2014-05-29 09:50:47,908][INFO ][node ] [elastic ASIC
nodo 3] initialized
[2014-05-29 09:50:47,909][INFO ][node ] [elastic ASIC
nodo 3] starting ...

¿Is there any other way of debugging what's going on with that node?

On Tue, May 27, 2014 at 12:49 PM, Jorge Ferrando jorfermo@gmail.com
wrote:

I thought about that but It would be strange because they are 3 Virtual
Machines in the same VMWare cluster with other hundreds of services and
nobody reported any networking problem.

On Thu, May 22, 2014 at 3:16 PM, emeschitc emeschitc@gmail.com wrote:

Hi,

I may be wrong but it seems to me you have a problem with your network.
It may be a flaky connection, broken nic or something wrong with your
configuration for discovery and/or data transport ?

Caused by: org.elasticsearch.transport.NodeNotConnectedException:
[elastic ASIC nodo 2][inet[/158.42.250.79:9301]] Node not connected
at
org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:859)
at
org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:540)
at
org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:189)

Check the status of the network on this node.

On Thu, May 22, 2014 at 2:07 PM, Jorge Ferrando [via ElasticSearch
Users] <[hidden email]
http://user/SendEmail.jtp?type=node&node=4056287&i=0> wrote:

Hello

We have a cluster of 3 nodes running Ubuntu 12.04.4 LTS 64bits, and
elasticsearch v1.1.1

It's be running flawlessly but since the last weak some of the nodes
restarts randomly and cluster gets to red state, then yellow, then green
and it happens again in a loop (sometimes it even doesnt get green state)

I've tried to look at the logs but i can't find and obvious reason of
what can be going on

I've found entries like these, but I don't know if they are in some way
related to the crash:

[2014-05-22 13:55:16,150][WARN ][index.codec ] [elastic
ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field:
[date_end] returning default postings format
[2014-05-22 13:55:16,151][WARN ][index.codec ] [elastic
ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field:
[date_end.raw] returning default postings format
[2014-05-22 13:55:16,151][WARN ][index.codec ] [elastic
ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field:
[date_start] returning default postings format
[2014-05-22 13:55:16,151][WARN ][index.codec ] [elastic
ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field:
[date_start.raw] returning default postings format

For instance right now it was in yellow state, really close to get to
the green state and suddenly node 3 autorestarted and now cluster is red
with 2000 shard initializing. The log in that node shows this:

[2014-05-22 13:59:48,498][INFO ][monitor.jvm ] [elastic
ASIC nodo 3] [gc][young][1181][222] duration [735ms], collections [1]/[1s],
total [735ms]/[1.1m], memory [6.5gb]->[6.1gb]/[19.9gb], all_pools {[young]
[456mb]->[7.2mb]/[532.5mb]}{[survivor] [66.5mb]->[66.5mb]/[66.5mb]}{[old]
[6gb]->[6gb]/[19.3gb]}
[2014-05-22 14:03:44,825][INFO ][node ] [elastic
ASIC nodo 3] version[1.1.1], pid[7511], build[f1585f0/2014-04-16T14:27:12Z]
[2014-05-22 14:03:44,826][INFO ][node ] [elastic
ASIC nodo 3] initializing ...
[2014-05-22 14:03:44,839][INFO ][plugins ] [elastic
ASIC nodo 3] loaded [], sites [paramedic, inquisitor, HQ, bigdesk, head]
[2014-05-22 14:03:51,967][INFO ][node ] [elastic
ASIC nodo 3] initialized
[2014-05-22 14:03:51,967][INFO ][node ] [elastic
ASIC nodo 3] starting ...

The crash happened exactly at 14:02.

Any Idea what can be going on or how can I trace what's happening?

After rebooting there are also DEBUG errors like this:

[2014-05-22 14:06:16,621][DEBUG][action.search.type ] [elastic
ASIC nodo 3] [logstash-2014.05.21][1], node[jgwbxcBoTVa3JIIG5a_FJA], [P],
s[STARTED]: Failed to execute
[org.elasticsearch.action.search.SearchRequest@42b80f4a] lastShard
[true]
org.elasticsearch.transport.SendRequestTransportException: [elastic
ASIC nodo 2][inet[/158.42.250.79:9301]][search/phase/query]
at
org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:202)
at
org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:173)
at
org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:208)
at
org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
at
org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
at
org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
at
org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.start(TransportSearchTypeAction.java:143)
at
org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction.doExecute(TransportSearchQueryThenFetchAction.java:59)
at
org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction.doExecute(TransportSearchQueryThenFetchAction.java:49)
at
org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
at
org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:108)
at org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:43)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
at org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:92)
at org.elasticsearch.client.support.AbstractClient.search(AbstractClient.java:212)
at org.elasticsearch.rest.action.search.RestSearchAction.handleRequest(RestSearchAction.java:98)
at org.elasticsearch.rest.RestController.executeHandler(RestController.java:159)
at org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:142)
at org.elasticsearch.http.HttpServer.internalDispatchRequest(HttpServer.java:121)
at org.elasticsearch.http.HttpServer$Dispatcher.dispatchRequest(HttpServer.java:83)
at org.elasticsearch.http.netty.NettyHttpServerTransport.dispatchRequest(NettyHttpServerTransport.java:291)
at org.elasticsearch.http.netty.HttpRequestHandler.messageReceived(HttpRequestHandler.java:43)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:145)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)
at org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)
at org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:74)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: org.elasticsearch.transport.NodeNotConnectedException: [elastic ASIC nodo 2][inet[/158.42.250.79:9301]] Node not connected
at org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:859)
at org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:540)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:189)
... 50 more



(David Pilato) #12

It sounds like the old GC is not able to free enough old gen space.
If you look at your Marvel dashboards, you should be able to see that in the old GC metrics.

So memory pressure is my first guess. You may have too many old GC cycles.

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 29 May 2014, at 10:32, Jorge Ferrando jorfermo@gmail.com wrote:

Thanks for the answer David

I added these settings to elasticsearch.yml a few days ago to see whether that was the problem:

discovery.zen.ping.timeout: 5s
discovery.zen.fd.ping_interval: 5s
discovery.zen.fd.ping_timeout: 60s
discovery.zen.fd.ping_retries: 3

If I'm not mistaken, with those settings the node should be marked as unavailable after 3m, yet most of the time it happens sooner. Am I wrong?
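
The 3m figure comes from multiplying the per-ping timeout by the retry count. A minimal sketch of that arithmetic, assuming (as the post does) that a node is dropped once `ping_retries` consecutive pings have each waited out the full `ping_timeout`. `parse_seconds` and `fd_worst_case_seconds` are hypothetical helpers written for this illustration, not Elasticsearch APIs, and the real detection time also depends on where in a ping interval the node stops answering:

```python
def parse_seconds(value: str) -> float:
    """Parse simple duration strings such as '5s', '60s' or '3m' into seconds."""
    units = {"ms": 0.001, "s": 1.0, "m": 60.0}
    # Check 'ms' before 's' so that a value like '500ms' is not misread.
    for suffix in ("ms", "s", "m"):
        if value.endswith(suffix):
            return float(value[: -len(suffix)]) * units[suffix]
    raise ValueError(f"unsupported duration: {value!r}")


def fd_worst_case_seconds(settings: dict) -> float:
    """Upper bound assumed in the post: every retry waits out the full timeout."""
    timeout = parse_seconds(settings["discovery.zen.fd.ping_timeout"])
    retries = int(settings["discovery.zen.fd.ping_retries"])
    return timeout * retries


# Values copied from the elasticsearch.yml excerpt above.
settings = {
    "discovery.zen.ping.timeout": "5s",
    "discovery.zen.fd.ping_interval": "5s",
    "discovery.zen.fd.ping_timeout": "60s",
    "discovery.zen.fd.ping_retries": "3",
}

window = fd_worst_case_seconds(settings)
print(window)  # 180.0 seconds, i.e. the 3m mentioned above
```

If that assumption holds, a node that disappears well before 180 seconds have passed was probably not removed by fault-detection timeouts alone, which may be worth keeping in mind when reading the logs.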

On Thu, May 29, 2014 at 10:29 AM, David Pilato david@pilato.fr wrote:
I think GC took too long, so your node became unresponsive.
If you set 30 GB of RAM, you should increase the ping timeout settings that decide when a node is marked as unresponsive.

And if you are under memory pressure, you could review your requests to see whether they can be optimized, or start new nodes...

My 2 cents.

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 29 May 2014, at 09:56, Jorge Ferrando jorfermo@gmail.com wrote:

I've been analyzing the problem with Marvel and Nagios and I managed to get 2 more details:

  • The node that restarts/reinitializes is always the same one: node 3.
  • It always happens shortly after the cluster reaches green state, between a few seconds and 2-3 minutes later.

I have debug mode on in logging.yml:

logger:
  # log action execution errors for easier debugging
  action: DEBUG

But I don't see anything in the log. For instance, the last time it happened, the cluster became green at around 9:47 and the node restarted at 9:50:

[2014-05-29 09:30:57,235][INFO ][monitor.jvm ] [elastic ASIC nodo 3] [gc][young][129][20] duration [745ms], collections [1]/[1s], total [745ms]/[8.5s], memory [951.1mb]->[598.9mb]/[29.9gb], all_pools {[young] [421.5mb]->[8.2mb]/[532.5mb]}{[survivor] [66.5mb]->[66.5mb]/[66.5mb]}{[old] [463.1mb]->[524.1mb]/[29.3gb]}
[2014-05-29 09:45:36,322][WARN ][monitor.jvm ] [elastic ASIC nodo 3] [gc][old][964][1] duration [29.5s], collections [1]/[30.4s], total [29.5s]/[29.5s], memory [5.1gb]->[4.3gb]/[29.9gb], all_pools {[young] [29.4mb]->[34.9mb]/[532.5mb]}{[survivor] [59.9mb]->[0b]/[66.5mb]}{[old] [5gb]->[4.2gb]/[29.3gb]}
[2014-05-29 09:50:41,040][INFO ][node ] [elastic ASIC nodo 3] version[1.2.0], pid[7021], build[c82387f/2014-05-22T12:49:13Z]
[2014-05-29 09:50:41,041][INFO ][node ] [elastic ASIC nodo 3] initializing ...
[2014-05-29 09:50:41,063][INFO ][plugins ] [elastic ASIC nodo 3] loaded [marvel], sites [marvel, paramedic, inquisitor, HQ, bigdesk, head]
[2014-05-29 09:50:47,908][INFO ][node ] [elastic ASIC nodo 3] initialized
[2014-05-29 09:50:47,909][INFO ][node ] [elastic ASIC nodo 3] starting ...
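
Long pauses are easy to miss when scanning `monitor.jvm` entries by eye. The following is a rough sketch, assuming the log layout shown above, that pulls the GC generation and pause duration out of each line and keeps only pauses above a threshold. `GC_LINE` and `gc_pauses` are hypothetical names for this illustration, the 1-second threshold is arbitrary, and the two sample lines are the ones above shortened after the `collections` field:

```python
import re

# Matches the timestamp, the GC generation and the pause duration of a
# monitor.jvm line in the layout shown in the log excerpt above.
GC_LINE = re.compile(
    r"\[(?P<ts>[^\]]+)\]\[(?P<level>\w+)\s*\]\[monitor\.jvm\s*\]"
    r".*?\[gc\]\[(?P<gen>\w+)\]\[\d+\]\[\d+\] "
    r"duration \[(?P<duration>[\d.]+)(?P<unit>ms|s|m)\]"
)

SCALE = {"ms": 0.001, "s": 1.0, "m": 60.0}


def gc_pauses(lines, threshold_s=1.0):
    """Yield (timestamp, generation, seconds) for pauses of at least threshold_s."""
    for line in lines:
        match = GC_LINE.search(line)
        if match is None:
            continue  # not a GC line in the expected layout
        seconds = float(match["duration"]) * SCALE[match["unit"]]
        if seconds >= threshold_s:
            yield match["ts"], match["gen"], seconds


sample = [
    "[2014-05-29 09:30:57,235][INFO ][monitor.jvm ] [elastic ASIC nodo 3] "
    "[gc][young][129][20] duration [745ms], collections [1]/[1s]",
    "[2014-05-29 09:45:36,322][WARN ][monitor.jvm ] [elastic ASIC nodo 3] "
    "[gc][old][964][1] duration [29.5s], collections [1]/[30.4s]",
]

for ts, gen, seconds in gc_pauses(sample):
    print(ts, gen, seconds)  # 2014-05-29 09:45:36,322 old 29.5
```

Run against the full log file, a filter like this would show whether long old-generation pauses line up with the restart times.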

Is there any other way of debugging what's going on with that node?

On Tue, May 27, 2014 at 12:49 PM, Jorge Ferrando jorfermo@gmail.com wrote:
I thought about that, but it would be strange because they are 3 virtual machines in the same VMware cluster alongside hundreds of other services, and nobody has reported any networking problem.

On Thu, May 22, 2014 at 3:16 PM, emeschitc emeschitc@gmail.com wrote:
Hi,

I may be wrong, but it seems to me you have a problem with your network. It may be a flaky connection, a broken NIC, or something wrong with your configuration for discovery and/or data transport.

Caused by: org.elasticsearch.transport.NodeNotConnectedException: [elastic ASIC nodo 2][inet[/158.42.250.79:9301]] Node not connected
at org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:859)
at org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:540)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:189)

Check the status of the network on this node.
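
One quick way to test basic reachability is to attempt a plain TCP connection to the transport address from the exception. This is only a sketch: `can_connect` is a hypothetical helper, the 3-second timeout is arbitrary, and a successful connect shows that the port accepts connections at that moment, not that the Elasticsearch transport layer is healthy:

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Example call, using the address from the NodeNotConnectedException above:
#   can_connect("158.42.250.79", 9301)
```

Running it in a loop from each of the other nodes could help tell an intermittent network fault apart from a node that is simply down.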

On Thu, May 22, 2014 at 2:07 PM, Jorge Ferrando [via ElasticSearch Users] <[hidden email]> wrote:
Hello

We have a cluster of 3 nodes running Ubuntu 12.04.4 LTS 64-bit and Elasticsearch v1.1.1.

It has been running flawlessly, but since last week some of the nodes restart randomly and the cluster goes to red state, then yellow, then green, and it happens again in a loop (sometimes it doesn't even reach green state).

I've tried to look at the logs but I can't find an obvious reason for what is going on.

I've found entries like these, but I don't know if they are in some way related to the crash:

[2014-05-22 13:55:16,150][WARN ][index.codec ] [elastic ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field: [date_end] returning default postings format
[2014-05-22 13:55:16,151][WARN ][index.codec ] [elastic ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field: [date_end.raw] returning default postings format
[2014-05-22 13:55:16,151][WARN ][index.codec ] [elastic ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field: [date_start] returning default postings format
[2014-05-22 13:55:16,151][WARN ][index.codec ] [elastic ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field: [date_start.raw] returning default postings format

For instance, right now it was in yellow state, really close to reaching green state, when suddenly node 3 restarted by itself, and now the cluster is red with 2000 shards initializing. The log on that node shows this:

[2014-05-22 13:59:48,498][INFO ][monitor.jvm ] [elastic ASIC nodo 3] [gc][young][1181][222] duration [735ms], collections [1]/[1s], total [735ms]/[1.1m], memory [6.5gb]->[6.1gb]/[19.9gb], all_pools {[young] [456mb]->[7.2mb]/[532.5mb]}{[survivor] [66.5mb]->[66.5mb]/[66.5mb]}{[old] [6gb]->[6gb]/[19.3gb]}
[2014-05-22 14:03:44,825][INFO ][node ] [elastic ASIC nodo 3] version[1.1.1], pid[7511], build[f1585f0/2014-04-16T14:27:12Z]
[2014-05-22 14:03:44,826][INFO ][node ] [elastic ASIC nodo 3] initializing ...
[2014-05-22 14:03:44,839][INFO ][plugins ] [elastic ASIC nodo 3] loaded [], sites [paramedic, inquisitor, HQ, bigdesk, head]
[2014-05-22 14:03:51,967][INFO ][node ] [elastic ASIC nodo 3] initialized
[2014-05-22 14:03:51,967][INFO ][node ] [elastic ASIC nodo 3] starting ...

The crash happened exactly at 14:02.

Any idea what can be going on, or how I can trace what's happening?

After rebooting there are also DEBUG errors like this:

[2014-05-22 14:06:16,621][DEBUG][action.search.type ] [elastic ASIC nodo 3] [logstash-2014.05.21][1], node[jgwbxcBoTVa3JIIG5a_FJA], [P], s[STARTED]: Failed to execute [org.elasticsearch.action.search.SearchRequest@42b80f4a] lastShard [true]
org.elasticsearch.transport.SendRequestTransportException: [elastic ASIC nodo 2][inet[/158.42.250.79:9301]][search/phase/query]
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:202)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:173)
at org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:208)
at org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.start(TransportSearchTypeAction.java:143)
at org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction.doExecute(TransportSearchQueryThenFetchAction.java:59)
at org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction.doExecute(TransportSearchQueryThenFetchAction.java:49)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
at org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:108)
at org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:43)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
at org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:92)
at org.elasticsearch.client.support.AbstractClient.search(AbstractClient.java:212)
at org.elasticsearch.rest.action.search.RestSearchAction.handleRequest(RestSearchAction.java:98)
at org.elasticsearch.rest.RestController.executeHandler(RestController.java:159)
at org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:142)
at org.elasticsearch.http.HttpServer.internalDispatchRequest(HttpServer.java:121)
at org.elasticsearch.http.HttpServer$Dispatcher.dispatchRequest(HttpServer.java:83)
at org.elasticsearch.http.netty.NettyHttpServerTransport.dispatchRequest(NettyHttpServerTransport.java:291)
at org.elasticsearch.http.netty.HttpRequestHandler.messageReceived(HttpRequestHandler.java:43)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:145)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)
at org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)
at org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:74)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: org.elasticsearch.transport.NodeNotConnectedException: [elastic ASIC nodo 2][inet[/158.42.250.79:9301]] Node not connected
at org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:859)
at org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:540)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:189)
... 50 more



(Jorge Ferrando) #13

This is what Marvel shows for old GC in the last 6 hours for that node:

[image: Inline image 1]

On Thu, May 29, 2014 at 10:39 AM, David Pilato david@pilato.fr wrote:

It sounds like the old GC is not able to free enough old gen space.
If you look at your Marvel dashboards, you should be able to see that in
the old GC metrics.

So memory pressure is my first guess. You may have too many old GC cycles.

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 29 May 2014, at 10:32, Jorge Ferrando jorfermo@gmail.com wrote:

Thanks for the answer David

I added these settings to elasticsearch.yml a few days ago to see whether
that was the problem:

discovery.zen.ping.timeout: 5s
discovery.zen.fd.ping_interval: 5s
discovery.zen.fd.ping_timeout: 60s
discovery.zen.fd.ping_retries: 3

If I'm not mistaken, with those settings the node should be marked as
unavailable after 3m, yet most of the time it happens sooner. Am I wrong?

On Thu, May 29, 2014 at 10:29 AM, David Pilato david@pilato.fr wrote:

I think GC took too long, so your node became unresponsive.
If you set 30 GB of RAM, you should increase the ping timeout settings
that decide when a node is marked as unresponsive.

And if you are under memory pressure, you could review your requests to
see whether they can be optimized, or start new nodes...

My 2 cents.

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 29 May 2014, at 09:56, Jorge Ferrando jorfermo@gmail.com wrote:

I've been analyzing the problem with Marvel and Nagios and I managed to
get 2 more details:

  • The node that restarts/reinitializes is always the same one: node 3.
  • It always happens shortly after the cluster reaches green state,
    between a few seconds and 2-3 minutes later.

I have debug mode on in logging.yml:

logger:
  # log action execution errors for easier debugging
  action: DEBUG

But I don't see anything in the log. For instance, the last time it
happened, the cluster became green at around 9:47 and the node restarted
at 9:50:

[2014-05-29 09:30:57,235][INFO ][monitor.jvm ] [elastic ASIC nodo 3] [gc][young][129][20] duration [745ms], collections [1]/[1s], total [745ms]/[8.5s], memory [951.1mb]->[598.9mb]/[29.9gb], all_pools {[young] [421.5mb]->[8.2mb]/[532.5mb]}{[survivor] [66.5mb]->[66.5mb]/[66.5mb]}{[old] [463.1mb]->[524.1mb]/[29.3gb]}
[2014-05-29 09:45:36,322][WARN ][monitor.jvm ] [elastic ASIC nodo 3] [gc][old][964][1] duration [29.5s], collections [1]/[30.4s], total [29.5s]/[29.5s], memory [5.1gb]->[4.3gb]/[29.9gb], all_pools {[young] [29.4mb]->[34.9mb]/[532.5mb]}{[survivor] [59.9mb]->[0b]/[66.5mb]}{[old] [5gb]->[4.2gb]/[29.3gb]}
[2014-05-29 09:50:41,040][INFO ][node ] [elastic ASIC nodo 3] version[1.2.0], pid[7021], build[c82387f/2014-05-22T12:49:13Z]
[2014-05-29 09:50:41,041][INFO ][node ] [elastic ASIC nodo 3] initializing ...
[2014-05-29 09:50:41,063][INFO ][plugins ] [elastic ASIC nodo 3] loaded [marvel], sites [marvel, paramedic, inquisitor, HQ, bigdesk, head]
[2014-05-29 09:50:47,908][INFO ][node ] [elastic ASIC nodo 3] initialized
[2014-05-29 09:50:47,909][INFO ][node ] [elastic ASIC nodo 3] starting ...

Is there any other way of debugging what's going on with that node?

On Tue, May 27, 2014 at 12:49 PM, Jorge Ferrando jorfermo@gmail.com wrote:

I thought about that, but it would be strange because they are 3 virtual
machines in the same VMware cluster alongside hundreds of other services,
and nobody has reported any networking problem.

On Thu, May 22, 2014 at 3:16 PM, emeschitc emeschitc@gmail.com wrote:

Hi,

I may be wrong, but it seems to me you have a problem with your network.
It may be a flaky connection, a broken NIC, or something wrong with your
configuration for discovery and/or data transport.

Caused by: org.elasticsearch.transport.NodeNotConnectedException: [elastic ASIC nodo 2][inet[/158.42.250.79:9301]] Node not connected
at org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:859)
at org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:540)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:189)

Check the status of the network on this node.

On Thu, May 22, 2014 at 2:07 PM, Jorge Ferrando [via ElasticSearch Users] <[hidden email]> wrote:

Hello

We have a cluster of 3 nodes running Ubuntu 12.04.4 LTS 64-bit and
Elasticsearch v1.1.1.

It has been running flawlessly, but since last week some of the nodes
restart randomly and the cluster goes to red state, then yellow, then
green, and it happens again in a loop (sometimes it doesn't even reach
green state).

I've tried to look at the logs but I can't find an obvious reason for
what is going on.

I've found entries like these, but I don't know if they are in some
way related to the crash:

[2014-05-22 13:55:16,150][WARN ][index.codec ] [elastic ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field: [date_end] returning default postings format
[2014-05-22 13:55:16,151][WARN ][index.codec ] [elastic ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field: [date_end.raw] returning default postings format
[2014-05-22 13:55:16,151][WARN ][index.codec ] [elastic ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field: [date_start] returning default postings format
[2014-05-22 13:55:16,151][WARN ][index.codec ] [elastic ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field: [date_start.raw] returning default postings format

For instance, right now it was in yellow state, really close to reaching
green state, when suddenly node 3 restarted by itself, and now the cluster
is red with 2000 shards initializing. The log on that node shows this:

[2014-05-22 13:59:48,498][INFO ][monitor.jvm ] [elastic ASIC nodo 3] [gc][young][1181][222] duration [735ms], collections [1]/[1s], total [735ms]/[1.1m], memory [6.5gb]->[6.1gb]/[19.9gb], all_pools {[young] [456mb]->[7.2mb]/[532.5mb]}{[survivor] [66.5mb]->[66.5mb]/[66.5mb]}{[old] [6gb]->[6gb]/[19.3gb]}
[2014-05-22 14:03:44,825][INFO ][node ] [elastic ASIC nodo 3] version[1.1.1], pid[7511], build[f1585f0/2014-04-16T14:27:12Z]
[2014-05-22 14:03:44,826][INFO ][node ] [elastic ASIC nodo 3] initializing ...
[2014-05-22 14:03:44,839][INFO ][plugins ] [elastic ASIC nodo 3] loaded [], sites [paramedic, inquisitor, HQ, bigdesk, head]
[2014-05-22 14:03:51,967][INFO ][node ] [elastic ASIC nodo 3] initialized
[2014-05-22 14:03:51,967][INFO ][node ] [elastic ASIC nodo 3] starting ...

The crash happened exactly at 14:02.

Any idea what can be going on, or how I can trace what's happening?

After rebooting there are also DEBUG errors like this:

[2014-05-22 14:06:16,621][DEBUG][action.search.type ] [elastic
ASIC nodo 3] [logstash-2014.05.21][1], node[jgwbxcBoTVa3JIIG5a_FJA], [P],
s[STARTED]: Failed to execute
[org.elasticsearch.action.search.SearchRequest@42b80f4a] lastShard
[true]
org.elasticsearch.transport.SendRequestTransportException: [elastic
ASIC nodo 2][inet[/158.42.250.79:9301]][search/phase/query]
at
org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:202)
at
org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:173)
at
org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:208)
at
org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
at
org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
at
org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
at
org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.start(TransportSearchTypeAction.java:143)
at org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction.doExecute(TransportSearchQueryThenFetchAction.java:59)
at org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction.doExecute(TransportSearchQueryThenFetchAction.java:49)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
at org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:108)
at org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:43)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
at org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:92)
at org.elasticsearch.client.support.AbstractClient.search(AbstractClient.java:212)
at org.elasticsearch.rest.action.search.RestSearchAction.handleRequest(RestSearchAction.java:98)
at org.elasticsearch.rest.RestController.executeHandler(RestController.java:159)
at org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:142)
at org.elasticsearch.http.HttpServer.internalDispatchRequest(HttpServer.java:121)
at org.elasticsearch.http.HttpServer$Dispatcher.dispatchRequest(HttpServer.java:83)
at org.elasticsearch.http.netty.NettyHttpServerTransport.dispatchRequest(NettyHttpServerTransport.java:291)
at org.elasticsearch.http.netty.HttpRequestHandler.messageReceived(HttpRequestHandler.java:43)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:145)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)
at org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)
at org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:74)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: org.elasticsearch.transport.NodeNotConnectedException: [elastic ASIC nodo 2][inet[/158.42.250.79:9301]] Node not connected
at org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:859)
at org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:540)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:189)
... 50 more

--
You received this message because you are subscribed to the Google
Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to [hidden email]
http://user/SendEmail.jtp?type=node&node=4056276&i=0.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/fa53a41d-064b-4250-8003-31cf845b7216%40googlegroups.com
https://groups.google.com/d/msgid/elasticsearch/fa53a41d-064b-4250-8003-31cf845b7216%40googlegroups.com?utm_medium=email&utm_source=footer
.
For more options, visit https://groups.google.com/d/optout.




(David Pilato) #14

I think, though I may be wrong, that because this node is unresponsive it no longer collects GC data.
Maybe you could look further back in time, before things started to get worse.

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 29 May 2014, at 10:43, Jorge Ferrando jorfermo@gmail.com wrote:

This is what Marvel shows for old GC in the last 6 hours for that node:

<image.png>

On Thu, May 29, 2014 at 10:39 AM, David Pilato david@pilato.fr wrote:
It sounds like the old GC is not able to free enough old-gen space.
I guess that if you look at your Marvel dashboards, you will see this in the old GC graphs.

So memory pressure is my first guess. You may have too many old GC cycles.

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 29 May 2014, at 10:32, Jorge Ferrando jorfermo@gmail.com wrote:

Thanks for the answer, David.

I added these settings to elasticsearch.yml a few days ago to see if that was the problem:

discovery.zen.ping.timeout: 5s
discovery.zen.fd.ping_interval: 5s
discovery.zen.fd.ping_timeout: 60s
discovery.zen.fd.ping_retries: 3

If I'm not mistaken, with those settings the node should be marked as unavailable after about 3 minutes (3 retries × 60s timeout), yet most of the time it happens sooner. Am I wrong?
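
The arithmetic behind that estimate can be sketched as follows. This is a rough model, not Elasticsearch's actual fault-detection code: it assumes a node is dropped only after `ping_retries` consecutive pings have each waited the full `ping_timeout`. The function name `worst_case_detection_seconds` is hypothetical, and whether `ping_interval` is also waited between retries is an assumption, so both variants are shown.

```python
def worst_case_detection_seconds(ping_timeout_s, ping_retries, ping_interval_s=0.0):
    """Rough upper bound on the time before an unresponsive node is dropped.

    Assumes every retry waits for the full timeout. ping_interval_s is added
    between attempts only if the caller passes it (an assumption, not a
    confirmed detail of Elasticsearch's behaviour).
    """
    if ping_retries < 1:
        raise ValueError("ping_retries must be at least 1")
    waits = ping_retries * ping_timeout_s
    gaps = (ping_retries - 1) * ping_interval_s
    return waits + gaps


# Values taken from the elasticsearch.yml fragment quoted above.
without_gaps = worst_case_detection_seconds(60, 3)
with_gaps = worst_case_detection_seconds(60, 3, ping_interval_s=5)
print(f"{without_gaps / 60:.2f} min if retries follow each other immediately")
print(f"{with_gaps / 60:.2f} min if the ping interval is waited between retries")
```

Under this model the bound is roughly three minutes either way, so a node that drops out much sooner is probably not being removed by ping timeouts alone.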

On Thu, May 29, 2014 at 10:29 AM, David Pilato david@pilato.fr wrote:
I think GC took too long, so your node became unresponsive.
If you set 30 GB of RAM, you should increase the ping timeout settings so that more time passes before a node is marked as unresponsive.

And if you are under memory pressure, you could review your requests to see whether they can be optimized, or start new nodes...

My 2 cents.

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 29 May 2014, at 09:56, Jorge Ferrando jorfermo@gmail.com wrote:

I've been analyzing the problem with Marvel and Nagios and I managed to get 2 more details:

  • The node that restarts/reinitializes is always the same one: node 3.
  • It always happens shortly after the cluster reaches green state, between a few seconds and 2-3 minutes later.

I have debug mode on in logging.yml:

logger:
  # log action execution errors for easier debugging
  action: DEBUG

But I don't see anything in the log. For instance, the last time it happened, the cluster became green at around 9:47 and the node restarted at 9:50:

[2014-05-29 09:30:57,235][INFO ][monitor.jvm ] [elastic ASIC nodo 3] [gc][young][129][20] duration [745ms], collections [1]/[1s], total [745ms]/[8.5s], memory [951.1mb]->[598.9mb]/[29.9gb], all_pools {[young] [421.5mb]->[8.2mb]/[532.5mb]}{[survivor] [66.5mb]->[66.5mb]/[66.5mb]}{[old] [463.1mb]->[524.1mb]/[29.3gb]}
[2014-05-29 09:45:36,322][WARN ][monitor.jvm ] [elastic ASIC nodo 3] [gc][old][964][1] duration [29.5s], collections [1]/[30.4s], total [29.5s]/[29.5s], memory [5.1gb]->[4.3gb]/[29.9gb], all_pools {[young] [29.4mb]->[34.9mb]/[532.5mb]}{[survivor] [59.9mb]->[0b]/[66.5mb]}{[old] [5gb]->[4.2gb]/[29.3gb]}
[2014-05-29 09:50:41,040][INFO ][node ] [elastic ASIC nodo 3] version[1.2.0], pid[7021], build[c82387f/2014-05-22T12:49:13Z]
[2014-05-29 09:50:41,041][INFO ][node ] [elastic ASIC nodo 3] initializing ...
[2014-05-29 09:50:41,063][INFO ][plugins ] [elastic ASIC nodo 3] loaded [marvel], sites [marvel, paramedic, inquisitor, HQ, bigdesk, head]
[2014-05-29 09:50:47,908][INFO ][node ] [elastic ASIC nodo 3] initialized
[2014-05-29 09:50:47,909][INFO ][node ] [elastic ASIC nodo 3] starting ...

Is there any other way of debugging what's going on with that node?
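
One way to look for trouble in the node's own log is to scan the `monitor.jvm` entries for long collections. The sketch below is only an illustration: the regular expression is written against the log lines quoted in this thread and may not match other Elasticsearch versions, and the names `GC_LINE`, `parse_duration`, and `long_gc_pauses`, as well as the 10-second threshold, are hypothetical.

```python
import re

# Pattern modelled on the monitor.jvm lines quoted in this thread, e.g.
# "[...][WARN ][monitor.jvm ] [node name] [gc][old][964][1] duration [29.5s], ..."
GC_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\[\w+\s*\]\[monitor\.jvm\s*\].*?"
    r"\[gc\]\[(?P<gen>\w+)\]\[\d+\]\[\d+\] duration \[(?P<dur>[\d.]+)(?P<unit>ms|s|m)\]"
)

UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "m": 60.0}


def parse_duration(value, unit):
    """Convert a duration such as ('29.5', 's') or ('745', 'ms') to seconds."""
    return float(value) * UNIT_SECONDS[unit]


def long_gc_pauses(lines, threshold_s=10.0):
    """Return (timestamp, generation, seconds) for GC entries at or above the threshold."""
    pauses = []
    for line in lines:
        match = GC_LINE.match(line)
        if match is None:
            continue  # not a monitor.jvm GC line
        seconds = parse_duration(match["dur"], match["unit"])
        if seconds >= threshold_s:
            pauses.append((match["ts"], match["gen"], seconds))
    return pauses


# Two entries shortened from the log lines quoted above.
sample = [
    "[2014-05-29 09:30:57,235][INFO ][monitor.jvm ] [elastic ASIC nodo 3] [gc][young][129][20] duration [745ms], collections [1]/[1s]",
    "[2014-05-29 09:45:36,322][WARN ][monitor.jvm ] [elastic ASIC nodo 3] [gc][old][964][1] duration [29.5s], collections [1]/[30.4s]",
]
for ts, gen, seconds in long_gc_pauses(sample):
    print(ts, gen, seconds)
```

Passing the lines of the node's log file instead of `sample` would list every collection that paused the JVM for longer than the chosen threshold, which can then be compared with the times at which the node left the cluster.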

On Tue, May 27, 2014 at 12:49 PM, Jorge Ferrando jorfermo@gmail.com wrote:
I thought about that, but it would be strange, because they are 3 virtual machines in the same VMware cluster alongside hundreds of other services, and nobody has reported any networking problems.

On Thu, May 22, 2014 at 3:16 PM, emeschitc emeschitc@gmail.com wrote:
Hi,

I may be wrong, but it seems to me you have a problem with your network. It may be a flaky connection, a broken NIC, or something wrong with your configuration for discovery and/or data transport.

Caused by: org.elasticsearch.transport.NodeNotConnectedException: [elastic ASIC nodo 2][inet[/158.42.250.79:9301]] Node not connected
at org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:859)
at org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:540)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:189)

Check the status of the network on this node.
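
A basic reachability check of the transport port named in the exception could look like the sketch below. It only tests whether a TCP connection can be opened; it says nothing about Elasticsearch-level health. The helper name `can_connect` and the 3-second default timeout are assumptions made for illustration.

```python
import socket


def can_connect(host, port, timeout_s=3.0):
    """Return True if a TCP connection to host:port opens within timeout_s."""
    try:
        # create_connection raises OSError (including timeouts and refused
        # connections) when the connection cannot be established.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False
```

Running `can_connect("158.42.250.79", 9301)` from node 3, using the address reported in the exception, would show whether node 2's transport port is reachable at that moment. Repeating the check around the time of a restart would help separate a network problem from a node that is simply down.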

On Thu, May 22, 2014 at 2:07 PM, Jorge Ferrando [via ElasticSearch Users] <[hidden email]> wrote:
Hello

We have a cluster of 3 nodes running Ubuntu 12.04.4 LTS 64-bit and Elasticsearch v1.1.1.

It had been running flawlessly, but since last week some of the nodes restart randomly and the cluster goes to red state, then yellow, then green, and it happens again in a loop (sometimes it doesn't even reach green state).

I've tried to look at the logs but I can't find an obvious reason for what is going on.

I've found entries like these, but I don't know if they are in some way related to the crash:

[2014-05-22 13:55:16,150][WARN ][index.codec ] [elastic ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field: [date_end] returning default postings format
[2014-05-22 13:55:16,151][WARN ][index.codec ] [elastic ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field: [date_end.raw] returning default postings format
[2014-05-22 13:55:16,151][WARN ][index.codec ] [elastic ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field: [date_start] returning default postings format
[2014-05-22 13:55:16,151][WARN ][index.codec ] [elastic ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field: [date_start.raw] returning default postings format

For instance, just now it was in yellow state, very close to reaching green, when suddenly node 3 restarted itself; now the cluster is red with 2000 shards initializing. The log on that node shows this:

[2014-05-22 13:59:48,498][INFO ][monitor.jvm ] [elastic ASIC nodo 3] [gc][young][1181][222] duration [735ms], collections [1]/[1s], total [735ms]/[1.1m], memory [6.5gb]->[6.1gb]/[19.9gb], all_pools {[young] [456mb]->[7.2mb]/[532.5mb]}{[survivor] [66.5mb]->[66.5mb]/[66.5mb]}{[old] [6gb]->[6gb]/[19.3gb]}
[2014-05-22 14:03:44,825][INFO ][node ] [elastic ASIC nodo 3] version[1.1.1], pid[7511], build[f1585f0/2014-04-16T14:27:12Z]
[2014-05-22 14:03:44,826][INFO ][node ] [elastic ASIC nodo 3] initializing ...
[2014-05-22 14:03:44,839][INFO ][plugins ] [elastic ASIC nodo 3] loaded [], sites [paramedic, inquisitor, HQ, bigdesk, head]
[2014-05-22 14:03:51,967][INFO ][node ] [elastic ASIC nodo 3] initialized
[2014-05-22 14:03:51,967][INFO ][node ] [elastic ASIC nodo 3] starting ...

The crash happened exactly at 14:02.

Any idea what can be going on, or how I can trace what's happening?

After rebooting there are also DEBUG errors like this:

[2014-05-22 14:06:16,621][DEBUG][action.search.type ] [elastic ASIC nodo 3] [logstash-2014.05.21][1], node[jgwbxcBoTVa3JIIG5a_FJA], [P], s[STARTED]: Failed to execute [org.elasticsearch.action.search.SearchRequest@42b80f4a] lastShard [true]
org.elasticsearch.transport.SendRequestTransportException: [elastic ASIC nodo 2][inet[/158.42.250.79:9301]][search/phase/query]
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:202)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:173)
at org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:208)
at org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.start(TransportSearchTypeAction.java:143)
at org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction.doExecute(TransportSearchQueryThenFetchAction.java:59)
at org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction.doExecute(TransportSearchQueryThenFetchAction.java:49)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
at org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:108)
at org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:43)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
at org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:92)
at org.elasticsearch.client.support.AbstractClient.search(AbstractClient.java:212)
at org.elasticsearch.rest.action.search.RestSearchAction.handleRequest(RestSearchAction.java:98)
at org.elasticsearch.rest.RestController.executeHandler(RestController.java:159)
at org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:142)
at org.elasticsearch.http.HttpServer.internalDispatchRequest(HttpServer.java:121)
at org.elasticsearch.http.HttpServer$Dispatcher.dispatchRequest(HttpServer.java:83)
at org.elasticsearch.http.netty.NettyHttpServerTransport.dispatchRequest(NettyHttpServerTransport.java:291)
at org.elasticsearch.http.netty.HttpRequestHandler.messageReceived(HttpRequestHandler.java:43)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:145)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)
at org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)
at org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:74)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: org.elasticsearch.transport.NodeNotConnectedException: [elastic ASIC nodo 2][inet[/158.42.250.79:9301]] Node not connected
at org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:859)
at org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:540)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:189)
... 50 more



(Jorge Ferrando) #15

What could be the cause of that? An update of Elasticsearch? A configuration parameter? What should I look for in the logs?


On Thu, May 22, 2014 at 2:07 PM, Jorge Ferrando [via ElasticSearch Users] <[hidden email]> wrote:

Hello

We have a cluster of 3 nodes running Ubuntu 12.04.4 LTS 64-bit and Elasticsearch v1.1.1.

It had been running flawlessly, but since last week some of the nodes restart randomly and the cluster goes to red state, then yellow, then green, and it happens again in a loop (sometimes it doesn't even reach green state).

I've tried to look at the logs but I can't find an obvious reason for what is going on.

I've found entries like these, but I don't know if they are in some way related to the crash:

[2014-05-22 13:55:16,150][WARN ][index.codec ] [elastic ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field: [date_end] returning default postings format
[2014-05-22 13:55:16,151][WARN ][index.codec ] [elastic ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field: [date_end.raw] returning default postings format
[2014-05-22 13:55:16,151][WARN ][index.codec ] [elastic ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field: [date_start] returning default postings format
[2014-05-22 13:55:16,151][WARN ][index.codec ] [elastic ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field: [date_start.raw] returning default postings format

For instance, just now it was in yellow state, very close to reaching green, when suddenly node 3 restarted itself; now the cluster is red with 2000 shards initializing. The log on that node shows this:

[2014-05-22 13:59:48,498][INFO ][monitor.jvm ] [elastic ASIC nodo 3] [gc][young][1181][222] duration [735ms], collections [1]/[1s], total [735ms]/[1.1m], memory [6.5gb]->[6.1gb]/[19.9gb], all_pools {[young] [456mb]->[7.2mb]/[532.5mb]}{[survivor] [66.5mb]->[66.5mb]/[66.5mb]}{[old] [6gb]->[6gb]/[19.3gb]}
[2014-05-22 14:03:44,825][INFO ][node ] [elastic ASIC nodo 3] version[1.1.1], pid[7511], build[f1585f0/2014-04-16T14:27:12Z]
[2014-05-22 14:03:44,826][INFO ][node ] [elastic ASIC nodo 3] initializing ...
[2014-05-22 14:03:44,839][INFO ][plugins ] [elastic ASIC nodo 3] loaded [], sites [paramedic, inquisitor, HQ, bigdesk, head]
[2014-05-22 14:03:51,967][INFO ][node ] [elastic ASIC nodo 3] initialized
[2014-05-22 14:03:51,967][INFO ][node ] [elastic ASIC nodo 3] starting ...

The crash happened exactly at 14:02.

Any Idea what can be going on or how can I trace what's happening?

After rebooting there are also DEBUG errors like this:

[2014-05-22 14:06:16,621][DEBUG][action.search.type ] [elastic
ASIC nodo 3] [logstash-2014.05.21][1], node[jgwbxcBoTVa3JIIG5a_FJA], [P],
s[STARTED]: Failed to execute
[org.elasticsearch.action.search.SearchRequest@42b80f4a] lastShard
[true]
org.elasticsearch.transport.SendRequestTransportException: [elastic
ASIC nodo 2][inet[/158.42.250.79:9301]][search/phase/query]
at
org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:202)
at
org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:173)
at
org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:208)
at
org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
at
org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
at
org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
at
org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.start(TransportSearchTypeAction.java:143)
at
org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction.doExecute(TransportSearchQueryThenFetchAction.java:59)
at
org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction.doExecute(TransportSearchQueryThenFetchAction.java:49)
at
org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
at
org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:108)
at
org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:43)
at
org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
at
org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:92)
at
org.elasticsearch.client.support.AbstractClient.search(AbstractClient.java:212)
at
org.elasticsearch.rest.action.search.RestSearchAction.handleRequest(RestSearchAction.java:98)
at
org.elasticsearch.rest.RestController.executeHandler(RestController.java:159)
at
org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:142)
at
org.elasticsearch.http.HttpServer.internalDispatchRequest(HttpServer.java:121)
at
org.elasticsearch.http.HttpServer$Dispatcher.dispatchRequest(HttpServer.java:83)
at
org.elasticsearch.http.netty.NettyHttpServerTransport.dispatchRequest(NettyHttpServerTransport.java:291)
at
org.elasticsearch.http.netty.HttpRequestHandler.messageReceived(HttpRequestHandler.java:43)
at
org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at
org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at
org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at
org.elasticsearch.common.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:145)
at
org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at
org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at
org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at
org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at
org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)
at
org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)
at
org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)
at
org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at
org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at
org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at
org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:74)
at
org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at
org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at
org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at
org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at
org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at
org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
at
org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
at
org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at
org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at
org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at
org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: org.elasticsearch.transport.NodeNotConnectedException:
[elastic ASIC nodo 2][inet[/158.42.250.79:9301]] Node not connected
at
org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:859)
at
org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:540)
at
org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:189)
... 50 more


(Jorge Ferrando) #16

There are recent entries in the log (from about 15 minutes ago) for gc/young/old:

[2014-05-29 13:37:34,183][INFO ][monitor.jvm ] [elastic ASIC
nodo 3] [gc][young][38][5] duration [763ms], collections [1]/[1s], total
[763ms]/[2.3s], memory [609.6mb]->[166.3mb]/[29.9gb], all_pools {[young]
[528.7mb]->[29.8mb]/[532.5mb]}{[survivor]
[64.3mb]->[66.5mb]/[66.5mb]}{[old] [16.5mb]->[69.9mb]/[29.3gb]}
[2014-05-29 13:51:17,798][INFO ][monitor.jvm ] [elastic ASIC
nodo 3] [gc][young][846][205] duration [727ms], collections [1]/[1.6s],
total [727ms]/[1.1m], memory [4.2gb]->[4.2gb]/[29.9gb], all_pools {[young]
[11.3mb]->[8.2mb]/[532.5mb]}{[survivor] [66.5mb]->[51.1mb]/[66.5mb]}{[old]
[4.2gb]->[4.2gb]/[29.3gb]}

On Thu, May 29, 2014 at 1:51 PM, Jorge Ferrando jorfermo@gmail.com wrote:

What could be causing that? An Elasticsearch update? A configuration
parameter? What should I look for in the logs?

On Thu, May 29, 2014 at 10:51 AM, David Pilato david@pilato.fr wrote:

I think, though I may be wrong, that this node stops collecting GC data
once it becomes unresponsive.
Maybe you could look further back, before things started to get worse.

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 29 May 2014, at 10:43, Jorge Ferrando jorfermo@gmail.com wrote:

This is what Marvel shows for old GC in the last 6 hours for that node:

<image.png>

On Thu, May 29, 2014 at 10:39 AM, David Pilato david@pilato.fr wrote:

It sounds like the old GC cannot free enough old-generation space.
If you look at your Marvel dashboards, I expect you will see that in the
old GC charts.

So memory pressure is my first guess. You may have too many old GC
cycles.

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
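
To put numbers on "cannot clean old gen space enough", the pool figures that monitor.jvm prints can be turned into a reclaimed fraction and an occupancy. The following is a small Python sketch, assuming the [before]->[after]/[max] layout seen in this thread's log lines and binary units; the helper names are hypothetical and not part of Elasticsearch.

```python
import re

# Binary multipliers for the size suffixes seen in the GC log lines (assumed).
UNITS = {"b": 1, "kb": 1024, "mb": 1024 ** 2, "gb": 1024 ** 3, "tb": 1024 ** 4}


def to_bytes(size: str) -> float:
    """Convert a logged size such as '4.2gb' or '66.5mb' to bytes."""
    match = re.fullmatch(r"([\d.]+)(b|kb|mb|gb|tb)", size.strip().lower())
    if match is None:
        raise ValueError(f"unrecognised size: {size!r}")
    return float(match.group(1)) * UNITS[match.group(2)]


def old_gen_summary(before: str, after: str, maximum: str) -> dict:
    """Summarise one old-gen collection from its [before]->[after]/[max] figures."""
    b, a, mx = to_bytes(before), to_bytes(after), to_bytes(maximum)
    return {
        "reclaimed_fraction": (b - a) / b,  # share of old gen freed by this GC
        "occupancy_after": a / mx,          # how full old gen is afterwards
    }


# Figures from the old GC entry logged at 2014-05-29 09:45:36 in this thread.
summary = old_gen_summary("5gb", "4.2gb", "29.3gb")
print(f"reclaimed {summary['reclaimed_fraction']:.0%}, "
      f"occupancy {summary['occupancy_after']:.0%}")
```

For the single old GC entry logged in this thread, that works out to roughly 16% reclaimed and 14% occupancy. One sample is not enough to confirm or rule out memory pressure, so the same calculation would need to be run over a longer stretch of the log.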

On 29 May 2014, at 10:32, Jorge Ferrando jorfermo@gmail.com wrote:

Thanks for the answer, David.

I added these settings to elasticsearch.yml some days ago to see whether
that was the problem:

discovery.zen.ping.timeout: 5s
discovery.zen.fd.ping_interval: 5s
discovery.zen.fd.ping_timeout: 60s
discovery.zen.fd.ping_retries: 3

If I'm not mistaken, with those settings the node should be marked as
unavailable after 3 minutes, yet most of the time it happens sooner. Am I wrong?
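
As a rough check of the 3-minute estimate above, the arithmetic can be sketched in Python. This is a simplified model, not Elasticsearch's actual fault-detection code: it assumes each ping waits the full ping_timeout and that the node is dropped after ping_retries consecutive failures. The class and function names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class FaultDetection:
    """Zen fault-detection settings in seconds (values from the post above)."""
    ping_interval: float = 5.0
    ping_timeout: float = 60.0
    ping_retries: int = 3


def worst_case_detection_seconds(fd: FaultDetection) -> float:
    # Simplified model (an assumption): every ping waits the full timeout,
    # and the node is dropped after `ping_retries` consecutive failures.
    return fd.ping_retries * fd.ping_timeout


def pause_outlasts_one_ping(pause_seconds: float, fd: FaultDetection) -> bool:
    # Under this model a pause can only fail a ping if it lasts longer
    # than a single ping timeout.
    return pause_seconds > fd.ping_timeout


fd = FaultDetection()
print(worst_case_detection_seconds(fd))   # 180.0
print(pause_outlasts_one_ping(29.5, fd))  # False
```

Under this model the window is 180 seconds, and the 29.5 s old GC pause logged in this thread is shorter than a single 60 s ping timeout. If the model holds, a pause of that length alone would not be expected to trip fault detection, which fits the observation that the node goes away sooner than the settings suggest.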

On Thu, May 29, 2014 at 10:29 AM, David Pilato david@pilato.fr wrote:

I think GC took too long, so your node became unresponsive.
If you give it 30 GB of RAM, you should increase the ping timeout that
applies before a node is marked as unresponsive.

And if you are under memory pressure, you could review your requests to
see whether they can be optimized, or start new nodes...

My 2 cents.

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 29 May 2014, at 09:56, Jorge Ferrando jorfermo@gmail.com wrote:

I've been analyzing the problem with Marvel and Nagios and I managed to
get 2 more details:

  • The node that restarts/reinitializes is always the same one: node 3.
  • It always happens shortly after the cluster reaches green state,
    between a few seconds and 2-3 minutes later.

I have debug mode on in logging.yml:

logger:
  # log action execution errors for easier debugging
  action: DEBUG

But I don't see anything in the log. For instance, the last time it
happened, the cluster became green at around 9:47 and the node restarted
at 9:50:

[2014-05-29 09:30:57,235][INFO ][monitor.jvm ] [elastic
ASIC nodo 3] [gc][young][129][20] duration [745ms], collections [1]/[1s],
total [745ms]/[8.5s], memory [951.1mb]->[598.9mb]/[29.9gb], all_pools
{[young] [421.5mb]->[8.2mb]/[532.5mb]}{[survivor]
[66.5mb]->[66.5mb]/[66.5mb]}{[old] [463.1mb]->[524.1mb]/[29.3gb]}
[2014-05-29 09:45:36,322][WARN ][monitor.jvm ] [elastic
ASIC nodo 3] [gc][old][964][1] duration [29.5s], collections [1]/[30.4s],
total [29.5s]/[29.5s], memory [5.1gb]->[4.3gb]/[29.9gb], all_pools {[young]
[29.4mb]->[34.9mb]/[532.5mb]}{[survivor] [59.9mb]->[0b]/[66.5mb]}{[old]
[5gb]->[4.2gb]/[29.3gb]}
[2014-05-29 09:50:41,040][INFO ][node ] [elastic
ASIC nodo 3] version[1.2.0], pid[7021], build[c82387f/2014-05-22T12:49:13Z]
[2014-05-29 09:50:41,041][INFO ][node ] [elastic
ASIC nodo 3] initializing ...
[2014-05-29 09:50:41,063][INFO ][plugins ] [elastic
ASIC nodo 3] loaded [marvel], sites [marvel, paramedic, inquisitor, HQ,
bigdesk, head]
[2014-05-29 09:50:47,908][INFO ][node ] [elastic
ASIC nodo 3] initialized
[2014-05-29 09:50:47,909][INFO ][node ] [elastic
ASIC nodo 3] starting ...

Is there any other way of debugging what's going on with that node?
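
One way to trace this from the logs alone is to pull every monitor.jvm GC entry out of the node's log and list the long pauses. The following is a minimal Python sketch, assuming the log format shown above and that wrapped entries have been re-joined onto one line; the regular expression, function name and threshold are illustrative and not part of Elasticsearch.

```python
import re
from typing import Iterable, List, Tuple

# Matches entries like:
# [2014-05-29 09:45:36,322][WARN ][monitor.jvm ] [...] [gc][old][964][1] duration [29.5s]
GC_ENTRY = re.compile(
    r"\[(?P<ts>[\d\- :,]+)\]\[\w+\s*\]\[monitor\.jvm\s*\].*?"
    r"\[gc\]\[(?P<gen>\w+)\]\[\d+\]\[\d+\] duration "
    r"\[(?P<value>[\d.]+)(?P<unit>ms|s|m)\]"
)
SECONDS_PER_UNIT = {"ms": 0.001, "s": 1.0, "m": 60.0}


def long_gc_pauses(
    lines: Iterable[str], threshold_seconds: float = 1.0
) -> List[Tuple[str, str, float]]:
    """Return (timestamp, generation, seconds) for GC pauses >= threshold."""
    pauses = []
    for line in lines:
        match = GC_ENTRY.search(line)
        if match is None:
            continue  # not a monitor.jvm GC entry
        seconds = float(match["value"]) * SECONDS_PER_UNIT[match["unit"]]
        if seconds >= threshold_seconds:
            pauses.append((match["ts"], match["gen"], seconds))
    return pauses


# Re-joined excerpts of the log lines quoted above.
sample = [
    "[2014-05-29 09:30:57,235][INFO ][monitor.jvm ] [elastic ASIC nodo 3] "
    "[gc][young][129][20] duration [745ms], collections [1]/[1s]",
    "[2014-05-29 09:45:36,322][WARN ][monitor.jvm ] [elastic ASIC nodo 3] "
    "[gc][old][964][1] duration [29.5s], collections [1]/[30.4s]",
    "[2014-05-29 09:50:41,041][INFO ][node ] [elastic ASIC nodo 3] initializing ...",
]
print(long_gc_pauses(sample))
# [('2014-05-29 09:45:36,322', 'old', 29.5)]
```

Running it over the full log of node 3 would show whether long old GC pauses line up with the restart times. It cannot show what happened in the gap between the last entry and the restart, so the system logs of that machine are probably worth checking for the same time window as well.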

On Tue, May 27, 2014 at 12:49 PM, Jorge Ferrando jorfermo@gmail.com
wrote:

I thought about that, but it would be strange: they are 3 virtual
machines in the same VMware cluster alongside hundreds of other services,
and nobody has reported any networking problem.

On Thu, May 22, 2014 at 3:16 PM, emeschitc emeschitc@gmail.com
wrote:

Hi,

I may be wrong, but it seems to me you have a problem with your
network. It may be a flaky connection, a broken NIC, or something wrong with
your configuration for discovery and/or data transport.

Caused by: org.elasticsearch.transport.NodeNotConnectedException:
[elastic ASIC nodo 2][inet[/158.42.250.79:9301]] Node not connected
at
org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:859)
at
org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:540)
at
org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:189)

Check the status of the network on this node.




(David Pilato) #17

What do the older Marvel metrics show?
What does the field data memory look like?

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 29 May 2014 at 13:53, Jorge Ferrando jorfermo@gmail.com wrote:

There are recent GC (young/old) entries in the log, from about 15 minutes ago:

[2014-05-29 13:37:34,183][INFO ][monitor.jvm ] [elastic ASIC nodo 3] [gc][young][38][5] duration [763ms], collections [1]/[1s], total [763ms]/[2.3s], memory [609.6mb]->[166.3mb]/[29.9gb], all_pools {[young] [528.7mb]->[29.8mb]/[532.5mb]}{[survivor] [64.3mb]->[66.5mb]/[66.5mb]}{[old] [16.5mb]->[69.9mb]/[29.3gb]}
[2014-05-29 13:51:17,798][INFO ][monitor.jvm ] [elastic ASIC nodo 3] [gc][young][846][205] duration [727ms], collections [1]/[1.6s], total [727ms]/[1.1m], memory [4.2gb]->[4.2gb]/[29.9gb], all_pools {[young] [11.3mb]->[8.2mb]/[532.5mb]}{[survivor] [66.5mb]->[51.1mb]/[66.5mb]}{[old] [4.2gb]->[4.2gb]/[29.3gb]}

On Thu, May 29, 2014 at 1:51 PM, Jorge Ferrando jorfermo@gmail.com wrote:
What could be causing that? An Elasticsearch update? A configuration parameter? What should I look for in the logs?

On Thu, May 29, 2014 at 10:51 AM, David Pilato david@pilato.fr wrote:
I think, though I might be wrong, that because this node is unresponsive it no longer collects GC data.
Maybe you could look further back, before things started to get worse.

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 29 May 2014 at 10:43, Jorge Ferrando jorfermo@gmail.com wrote:

This is what Marvel shows for old GC in the last 6 hours for that node:

<image.png>

On Thu, May 29, 2014 at 10:39 AM, David Pilato david@pilato.fr wrote:
It sounds like the old GC is not able to free enough old gen space.
I guess that if you look at your Marvel dashboards, you will see that in the old GC charts.

So memory pressure is the first guess. You may have too many old GC cycles.

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 29 May 2014 at 10:32, Jorge Ferrando jorfermo@gmail.com wrote:

Thanks for the answer, David.

I added these settings to elasticsearch.yml some days ago to see if that was the problem:

discovery.zen.ping.timeout: 5s
discovery.zen.fd.ping_interval: 5s
discovery.zen.fd.ping_timeout: 60s
discovery.zen.fd.ping_retries: 3

If I'm not mistaken, with those settings the node should be marked as unavailable after 3 minutes, but most of the time it happens sooner. Am I wrong?
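
A rough check of that 3-minute figure, as a sketch only. It assumes fault detection gives up on a node once every retry has waited out the full `discovery.zen.fd.ping_timeout`, and it ignores `discovery.zen.fd.ping_interval`; that is a simplification, not a description of the exact zen discovery algorithm. The function name is made up for illustration.

```python
def worst_case_detection_seconds(ping_timeout_s, ping_retries):
    """Upper bound if every retry waits the full timeout (simplifying assumption)."""
    return ping_timeout_s * ping_retries


# Values quoted above:
#   discovery.zen.fd.ping_timeout: 60s
#   discovery.zen.fd.ping_retries: 3
total = worst_case_detection_seconds(60, 3)
print(f"{total}s = {total / 60:.0f}m")
```

Under that assumption the bound is 180 seconds. It only limits how long an unresponsive node can stay in the cluster; it says nothing about why the process itself would restart.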

On Thu, May 29, 2014 at 10:29 AM, David Pilato david@pilato.fr wrote:
GC took too long, so your node became unresponsive, I think.
If you give it 30 GB of RAM, you should increase the ping timeout setting that controls when a node is marked as unresponsive.

And if you are under memory pressure, you could review your requests to see whether they can be optimized, or start new nodes...

My 2 cents.

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 29 May 2014 at 09:56, Jorge Ferrando jorfermo@gmail.com wrote:

I've been analyzing the problem with Marvel and Nagios and I managed to get 2 more details:

  • The node that restarts/reinitializes is always the same one: node 3.
  • It always happens shortly after the cluster reaches green state, anywhere from a few seconds to 2-3 minutes later.

I have debug mode on in logging.yml:

logger:
  # log action execution errors for easier debugging
  action: DEBUG

But I don't see anything in the log. For instance, the last time it happened, the cluster became green at around 9:47 and the node restarted at 9:50:

[2014-05-29 09:30:57,235][INFO ][monitor.jvm ] [elastic ASIC nodo 3] [gc][young][129][20] duration [745ms], collections [1]/[1s], total [745ms]/[8.5s], memory [951.1mb]->[598.9mb]/[29.9gb], all_pools {[young] [421.5mb]->[8.2mb]/[532.5mb]}{[survivor] [66.5mb]->[66.5mb]/[66.5mb]}{[old] [463.1mb]->[524.1mb]/[29.3gb]}
[2014-05-29 09:45:36,322][WARN ][monitor.jvm ] [elastic ASIC nodo 3] [gc][old][964][1] duration [29.5s], collections [1]/[30.4s], total [29.5s]/[29.5s], memory [5.1gb]->[4.3gb]/[29.9gb], all_pools {[young] [29.4mb]->[34.9mb]/[532.5mb]}{[survivor] [59.9mb]->[0b]/[66.5mb]}{[old] [5gb]->[4.2gb]/[29.3gb]}
[2014-05-29 09:50:41,040][INFO ][node ] [elastic ASIC nodo 3] version[1.2.0], pid[7021], build[c82387f/2014-05-22T12:49:13Z]
[2014-05-29 09:50:41,041][INFO ][node ] [elastic ASIC nodo 3] initializing ...
[2014-05-29 09:50:41,063][INFO ][plugins ] [elastic ASIC nodo 3] loaded [marvel], sites [marvel, paramedic, inquisitor, HQ, bigdesk, head]
[2014-05-29 09:50:47,908][INFO ][node ] [elastic ASIC nodo 3] initialized
[2014-05-29 09:50:47,909][INFO ][node ] [elastic ASIC nodo 3] starting ...
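
One way to dig through a long log for pauses like the 29.5s old collection above is to parse the `monitor.jvm` lines mechanically. The following is a minimal sketch, assuming the line layout seen in these excerpts; the function names and the 10-second threshold are illustrative choices, not anything Elasticsearch defines.

```python
import re

# Matches the "[gc][<collector>][..][..] duration [<n><unit>]" part of a monitor.jvm line.
GC_RE = re.compile(
    r"\[gc\]\[(?P<collector>\w+)\]\[\d+\]\[\d+\] "
    r"duration \[(?P<value>[\d.]+)(?P<unit>ms|s|m|h)\]"
)

# Seconds per unit suffix (ms, s and m appear in the excerpts; h is included defensively).
UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0}


def parse_gc_line(line):
    """Return (collector, duration_in_seconds), or None if the line is not a GC entry."""
    match = GC_RE.search(line)
    if match is None:
        return None
    seconds = float(match.group("value")) * UNIT_SECONDS[match.group("unit")]
    return match.group("collector"), seconds


def long_pauses(lines, threshold_s=10.0):
    """Return the GC entries whose pause lasted at least threshold_s seconds."""
    parsed = (parse_gc_line(line) for line in lines)
    return [entry for entry in parsed if entry and entry[1] >= threshold_s]


# Two of the log lines quoted above, truncated after the first "collections" field.
SAMPLE = [
    "[2014-05-29 09:30:57,235][INFO ][monitor.jvm ] [elastic ASIC nodo 3] "
    "[gc][young][129][20] duration [745ms], collections [1]/[1s]",
    "[2014-05-29 09:45:36,322][WARN ][monitor.jvm ] [elastic ASIC nodo 3] "
    "[gc][old][964][1] duration [29.5s], collections [1]/[30.4s]",
]

if __name__ == "__main__":
    for collector, seconds in long_pauses(SAMPLE):
        print(f"{collector} GC paused for {seconds:.1f}s")
```

Fed the whole log file line by line, this would list every long collection with its collector, which makes it easier to see whether old GC pauses line up with the restarts.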

Is there any other way of debugging what's going on with that node?

On Tue, May 27, 2014 at 12:49 PM, Jorge Ferrando jorfermo@gmail.com wrote:
I thought about that, but it would be strange: they are 3 virtual machines in the same VMware cluster alongside hundreds of other services, and nobody has reported any networking problem.

On Thu, May 22, 2014 at 3:16 PM, emeschitc emeschitc@gmail.com wrote:
Hi,

I may be wrong, but it seems to me you have a problem with your network. It may be a flaky connection, a broken NIC, or something wrong with your configuration for discovery and/or data transport.

Caused by: org.elasticsearch.transport.NodeNotConnectedException: [elastic ASIC nodo 2][inet[/158.42.250.79:9301]] Node not connected
at org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:859)
at org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:540)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:189)

Check the status of the network on this node.

On Thu, May 22, 2014 at 2:07 PM, Jorge Ferrando [via ElasticSearch Users] <[hidden email]> wrote:
Hello

We have a cluster of 3 nodes running Ubuntu 12.04.4 LTS 64-bit and Elasticsearch v1.1.1.

It had been running flawlessly, but since last week some of the nodes restart randomly and the cluster goes to red state, then yellow, then green, and it happens again in a loop (sometimes it doesn't even reach green state).

I've tried to look at the logs but I can't find an obvious reason for what is going on.

I've found entries like these, but I don't know if they are in any way related to the crash:

[2014-05-22 13:55:16,150][WARN ][index.codec ] [elastic ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field: [date_end] returning default postings format
[2014-05-22 13:55:16,151][WARN ][index.codec ] [elastic ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field: [date_end.raw] returning default postings format
[2014-05-22 13:55:16,151][WARN ][index.codec ] [elastic ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field: [date_start] returning default postings format
[2014-05-22 13:55:16,151][WARN ][index.codec ] [elastic ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field: [date_start.raw] returning default postings format

For instance, just now it was in yellow state, really close to reaching green, when node 3 suddenly restarted by itself; now the cluster is red with 2000 shards initializing. The log on that node shows this:

[2014-05-22 13:59:48,498][INFO ][monitor.jvm ] [elastic ASIC nodo 3] [gc][young][1181][222] duration [735ms], collections [1]/[1s], total [735ms]/[1.1m], memory [6.5gb]->[6.1gb]/[19.9gb], all_pools {[young] [456mb]->[7.2mb]/[532.5mb]}{[survivor] [66.5mb]->[66.5mb]/[66.5mb]}{[old] [6gb]->[6gb]/[19.3gb]}
[2014-05-22 14:03:44,825][INFO ][node ] [elastic ASIC nodo 3] version[1.1.1], pid[7511], build[f1585f0/2014-04-16T14:27:12Z]
[2014-05-22 14:03:44,826][INFO ][node ] [elastic ASIC nodo 3] initializing ...
[2014-05-22 14:03:44,839][INFO ][plugins ] [elastic ASIC nodo 3] loaded [], sites [paramedic, inquisitor, HQ, bigdesk, head]
[2014-05-22 14:03:51,967][INFO ][node ] [elastic ASIC nodo 3] initialized
[2014-05-22 14:03:51,967][INFO ][node ] [elastic ASIC nodo 3] starting ...

The crash happened exactly at 14:02.

Any idea what could be going on, or how I can trace what's happening?

After rebooting there are also DEBUG errors like this:

[2014-05-22 14:06:16,621][DEBUG][action.search.type ] [elastic ASIC nodo 3] [logstash-2014.05.21][1], node[jgwbxcBoTVa3JIIG5a_FJA], [P], s[STARTED]: Failed to execute [org.elasticsearch.action.search.SearchRequest@42b80f4a] lastShard [true]
org.elasticsearch.transport.SendRequestTransportException: [elastic ASIC nodo 2][inet[/158.42.250.79:9301]][search/phase/query]
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:202)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:173)
at org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:208)
at org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.start(TransportSearchTypeAction.java:143)
at org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction.doExecute(TransportSearchQueryThenFetchAction.java:59)
at org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction.doExecute(TransportSearchQueryThenFetchAction.java:49)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
at org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:108)
at org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:43)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
at org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:92)
at org.elasticsearch.client.support.AbstractClient.search(AbstractClient.java:212)
at org.elasticsearch.rest.action.search.RestSearchAction.handleRequest(RestSearchAction.java:98)
at org.elasticsearch.rest.RestController.executeHandler(RestController.java:159)
at org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:142)
at org.elasticsearch.http.HttpServer.internalDispatchRequest(HttpServer.java:121)
at org.elasticsearch.http.HttpServer$Dispatcher.dispatchRequest(HttpServer.java:83)
at org.elasticsearch.http.netty.NettyHttpServerTransport.dispatchRequest(NettyHttpServerTransport.java:291)
at org.elasticsearch.http.netty.HttpRequestHandler.messageReceived(HttpRequestHandler.java:43)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:145)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)
at org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)
at org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:74)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: org.elasticsearch.transport.NodeNotConnectedException: [elastic ASIC nodo 2][inet[/158.42.250.79:9301]] Node not connected
at org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:859)
at org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:540)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:189)
... 50 more


(Jorge Ferrando) #18

I don't have older metrics in Marvel. I turned it on a few days ago to see if
it could help with solving the problem.

I couldn't find field data memory in the node statistics. Where can I find it?
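
One place to look, as a hedged sketch: the nodes stats API can report fielddata usage per node (for the 1.x line, `GET /_nodes/stats/indices/fielddata?fields=*`). The code below assumes that endpoint and a node reachable at `http://localhost:9200`; the helper names are made up for illustration, and the sample response is a hypothetical fragment with invented values so that the summary logic runs without a cluster.

```python
import json
from urllib.request import urlopen


def fielddata_by_node(stats):
    """Map node name -> fielddata bytes from a parsed nodes-stats response."""
    result = {}
    for node_id, node in stats.get("nodes", {}).items():
        fielddata = node.get("indices", {}).get("fielddata", {})
        result[node.get("name", node_id)] = fielddata.get("memory_size_in_bytes", 0)
    return result


def fetch_nodes_stats(base_url="http://localhost:9200"):
    """Fetch per-node fielddata stats; base_url is an assumed address."""
    url = base_url + "/_nodes/stats/indices/fielddata?fields=*"
    with urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical response fragment (node ID and sizes are invented for the example).
SAMPLE = {
    "nodes": {
        "example-node-id": {
            "name": "elastic ASIC nodo 3",
            "indices": {
                "fielddata": {"memory_size_in_bytes": 1048576, "evictions": 0}
            },
        }
    }
}

if __name__ == "__main__":
    # Swap SAMPLE for fetch_nodes_stats() to query a live node.
    for name, size in sorted(fielddata_by_node(SAMPLE).items()):
        print(f"{name}: {size / 1024 ** 2:.1f} MiB of fielddata")
```

Comparing the per-node figure with the heap size shown in the GC lines would indicate whether fielddata is a large share of the old generation.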

On Thu, May 29, 2014 at 3:40 PM, David Pilato david@pilato.fr wrote:

What gives older Marvel metrics?
What does the field data memory looks like?

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

Le 29 mai 2014 à 13:53, Jorge Ferrando jorfermo@gmail.com a écrit :

There are recent entries in the log (like 15 mins ago) about gc/young/old

[2014-05-29 13:37:34,183][INFO ][monitor.jvm ] [elastic ASIC
nodo 3] [gc][young][38][5] duration [763ms], collections [1]/[1s], total
[763ms]/[2.3s], memory [609.6mb]->[166.3mb]/[29.9gb], all_pools {[young]
[528.7mb]->[29.8mb]/[532.5mb]}{[survivor]
[64.3mb]->[66.5mb]/[66.5mb]}{[old] [16.5mb]->[69.9mb]/[29.3gb]}
[2014-05-29 13:51:17,798][INFO ][monitor.jvm ] [elastic ASIC
nodo 3] [gc][young][846][205] duration [727ms], collections [1]/[1.6s],
total [727ms]/[1.1m], memory [4.2gb]->[4.2gb]/[29.9gb], all_pools {[young]
[11.3mb]->[8.2mb]/[532.5mb]}{[survivor] [66.5mb]->[51.1mb]/[66.5mb]}{[old]
[4.2gb]->[4.2gb]/[29.3gb]}

On Thu, May 29, 2014 at 1:51 PM, Jorge Ferrando jorfermo@gmail.com
wrote:

What could be the cause of that? Any update of elasticsearch? Any
configuration parameter? What should I look for in the logs?

On Thu, May 29, 2014 at 10:51 AM, David Pilato david@pilato.fr wrote:

I think but might be wrong that this node as unresponsive does not
collect anymore GC data.
May be you could look in the past before things starting to be worse.

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

Le 29 mai 2014 à 10:43, Jorge Ferrando jorfermo@gmail.com a écrit :

This is what Marvel shows for old GC in the last 6 hours for that node:

<image.png>

On Thu, May 29, 2014 at 10:39 AM, David Pilato david@pilato.fr wrote:

It sounds like the old GC is not able to clean old gen space enough.
I guess that if you look at your Marvel dashboards, you can see that on
old GC.

So memory pressure is the first guess. You may have too many old GC
cycles.

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

Le 29 mai 2014 à 10:32, Jorge Ferrando jorfermo@gmail.com a écrit :

Thanks for the answer David

I added this setting to elasticsearch.yml some days ago to see if that
what's the problem:

discovery.zen.ping.timeout: 5s
discovery.zen.fd.ping_interval: 5s
discovery.zen.fd.ping_timeout: 60s
discovery.zen.fd.ping_retries: 3

If I'm not mistaken, with those settings the node should be marked as
unavailable after 3m and most of the times it happens quicker. Am I wrong?

On Thu, May 29, 2014 at 10:29 AM, David Pilato david@pilato.fr wrote:

GC took too much time so your node become unresponsive I think.
If you set 30 Gb RAM, you should increase the time out ping setting
before a node is marked as unresponsive.

And if you are under memory pressure, you could try to check your
requests and see if you can have some optimization or start new nodes...

My 2 cents.

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

Le 29 mai 2014 à 09:56, Jorge Ferrando jorfermo@gmail.com a écrit :

I've been analyzing the problem with Marvel and nagios and I managed
to get 2 more details:

  • The node restarting/reinitializing it's always the same. Node 3
  • It always happens quickly after getting the cluster in green state.
    Between some seconds and 2-3 minutes

I have debug mode on in logging.yml:

logger:

log action execution errors for easier debugging

action: DEBUG

But i dont see anything in the log. For instance, this is the last
time it happened at around 9:47 the cluster became green and 9:50 the node
restarted

[2014-05-29 09:30:57,235][INFO ][monitor.jvm ] [elastic
ASIC nodo 3] [gc][young][129][20] duration [745ms], collections [1]/[1s],
total [745ms]/[8.5s], memory [951.1mb]->[598.9mb]/[29.9gb], all_pools
{[young] [421.5mb]->[8.2mb]/[532.5mb]}{[survivor]
[66.5mb]->[66.5mb]/[66.5mb]}{[old] [463.1mb]->[524.1mb]/[29.3gb]}
[2014-05-29 09:45:36,322][WARN ][monitor.jvm ] [elastic
ASIC nodo 3] [gc][old][964][1] duration [29.5s], collections [1]/[30.4s],
total [29.5s]/[29.5s], memory [5.1gb]->[4.3gb]/[29.9gb], all_pools {[young]
[29.4mb]->[34.9mb]/[532.5mb]}{[survivor] [59.9mb]->[0b]/[66.5mb]}{[old]
[5gb]->[4.2gb]/[29.3gb]}
[2014-05-29 09:50:41,040][INFO ][node ] [elastic
ASIC nodo 3] version[1.2.0], pid[7021], build[c82387f/2014-05-22T12:49:13Z]
[2014-05-29 09:50:41,041][INFO ][node ] [elastic
ASIC nodo 3] initializing ...
[2014-05-29 09:50:41,063][INFO ][plugins ] [elastic
ASIC nodo 3] loaded [marvel], sites [marvel, paramedic, inquisitor, HQ,
bigdesk, head]
[2014-05-29 09:50:47,908][INFO ][node ] [elastic
ASIC nodo 3] initialized
[2014-05-29 09:50:47,909][INFO ][node ] [elastic
ASIC nodo 3] starting ...

¿Is there any other way of debugging what's going on with that node?

On Tue, May 27, 2014 at 12:49 PM, Jorge Ferrando jorfermo@gmail.com
wrote:

I thought about that but It would be strange because they are 3
Virtual Machines in the same VMWare cluster with other hundreds of services
and nobody reported any networking problem.

On Thu, May 22, 2014 at 3:16 PM, emeschitc emeschitc@gmail.com
wrote:

Hi,

I may be wrong but it seems to me you have a problem with your
network. It may be a flaky connection, broken nic or something wrong with
your configuration for discovery and/or data transport ?

Caused by: org.elasticsearch.transport.NodeNotConnectedException:
[elastic ASIC nodo 2][inet[/158.42.250.79:9301]] Node not connected
at
org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:859)
at
org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:540)
at
org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:189)

Check the status of the network on this node.

On Thu, May 22, 2014 at 2:07 PM, Jorge Ferrando [via ElasticSearch
Users] <[hidden email]
http://user/SendEmail.jtp?type=node&node=4056287&i=0> wrote:

Hello

We have a cluster of 3 nodes running Ubuntu 12.04.4 LTS 64bits, and
elasticsearch v1.1.1

It's be running flawlessly but since the last weak some of the
nodes restarts randomly and cluster gets to red state, then yellow, then
green and it happens again in a loop (sometimes it even doesnt get green
state)

I've tried to look at the logs but i can't find and obvious reason
of what can be going on

I've found entries like these, but I don't know if they are in some
way related to the crash:

[2014-05-22 13:55:16,150][WARN ][index.codec ]
[elastic ASIC nodo 3] [logstash-2014.05.22] no index mapper found for
field: [date_end] returning default postings format
[2014-05-22 13:55:16,151][WARN ][index.codec ]
[elastic ASIC nodo 3] [logstash-2014.05.22] no index mapper found for
field: [date_end.raw] returning default postings format
[2014-05-22 13:55:16,151][WARN ][index.codec ]
[elastic ASIC nodo 3] [logstash-2014.05.22] no index mapper found for
field: [date_start] returning default postings format
[2014-05-22 13:55:16,151][WARN ][index.codec ]
[elastic ASIC nodo 3] [logstash-2014.05.22] no index mapper found for
field: [date_start.raw] returning default postings format

For instance right now it was in yellow state, really close to get
to the green state and suddenly node 3 autorestarted and now cluster is red
with 2000 shard initializing. The log in that node shows this:

[2014-05-22 13:59:48,498][INFO ][monitor.jvm ]
[elastic ASIC nodo 3] [gc][young][1181][222] duration [735ms], collections
[1]/[1s], total [735ms]/[1.1m], memory [6.5gb]->[6.1gb]/[19.9gb], all_pools
{[young] [456mb]->[7.2mb]/[532.5mb]}{[survivor]
[66.5mb]->[66.5mb]/[66.5mb]}{[old] [6gb]->[6gb]/[19.3gb]}
[2014-05-22 14:03:44,825][INFO ][node ]
[elastic ASIC nodo 3] version[1.1.1], pid[7511],
build[f1585f0/2014-04-16T14:27:12Z]
[2014-05-22 14:03:44,826][INFO ][node ]
[elastic ASIC nodo 3] initializing ...
[2014-05-22 14:03:44,839][INFO ][plugins ]
[elastic ASIC nodo 3] loaded [], sites [paramedic, inquisitor, HQ, bigdesk,
head]
[2014-05-22 14:03:51,967][INFO ][node ]
[elastic ASIC nodo 3] initialized
[2014-05-22 14:03:51,967][INFO ][node ]
[elastic ASIC nodo 3] starting ...

The crash happened exactly at 14:02.

Any Idea what can be going on or how can I trace what's happening?

After rebooting there are also DEBUG errors like this:

[2014-05-22 14:06:16,621][DEBUG][action.search.type ]
[elastic ASIC nodo 3] [logstash-2014.05.21][1],
node[jgwbxcBoTVa3JIIG5a_FJA], [P], s[STARTED]: Failed to execute
[org.elasticsearch.action.search.SearchRequest@42b80f4a] lastShard
[true]
org.elasticsearch.transport.SendRequestTransportException: [elastic
ASIC nodo 2][inet[/158.42.250.79:9301]][search/phase/query]
at
org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:202)
at
org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:173)
at org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:208)
at org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.start(TransportSearchTypeAction.java:143)
at org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction.doExecute(TransportSearchQueryThenFetchAction.java:59)
at org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction.doExecute(TransportSearchQueryThenFetchAction.java:49)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
at org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:108)
at org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:43)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
at org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:92)
at org.elasticsearch.client.support.AbstractClient.search(AbstractClient.java:212)
at org.elasticsearch.rest.action.search.RestSearchAction.handleRequest(RestSearchAction.java:98)
at org.elasticsearch.rest.RestController.executeHandler(RestController.java:159)
at org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:142)
at org.elasticsearch.http.HttpServer.internalDispatchRequest(HttpServer.java:121)
at org.elasticsearch.http.HttpServer$Dispatcher.dispatchRequest(HttpServer.java:83)
at org.elasticsearch.http.netty.NettyHttpServerTransport.dispatchRequest(NettyHttpServerTransport.java:291)
at org.elasticsearch.http.netty.HttpRequestHandler.messageReceived(HttpRequestHandler.java:43)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:145)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)
at org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)
at org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:74)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: org.elasticsearch.transport.NodeNotConnectedException: [elastic ASIC nodo 2][inet[/158.42.250.79:9301]] Node not connected
at org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:859)
at org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:540)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:189)
... 50 more

--
You received this message because you are subscribed to the Google
Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it,
send an email to [hidden email]
http://user/SendEmail.jtp?type=node&node=4056276&i=0.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/fa53a41d-064b-4250-8003-31cf845b7216%40googlegroups.com
https://groups.google.com/d/msgid/elasticsearch/fa53a41d-064b-4250-8003-31cf845b7216%40googlegroups.com?utm_medium=email&utm_source=footer
.
For more options, visit https://groups.google.com/d/optout.




(David Pilato) #19

It's in index statistics under memory row.
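
Outside the Marvel dashboard, the same figure can also be read from the node stats API. The sketch below assumes a response shaped like the output of `GET /_nodes/stats/indices/fielddata`, with each node reporting its field data size under `indices.fielddata.memory_size_in_bytes`. The function name, the sample node ID and name, and the byte count are hypothetical and are not taken from this cluster.

```python
import json

def fielddata_bytes_by_node(stats: dict) -> dict:
    """Map each node name to its reported field data memory, in bytes."""
    result = {}
    for node_id, node in stats.get("nodes", {}).items():
        fielddata = node.get("indices", {}).get("fielddata", {})
        # Fall back to the node ID if the response carries no node name.
        result[node.get("name", node_id)] = fielddata.get("memory_size_in_bytes", 0)
    return result

# Hypothetical response body; the values are illustrative only.
sample = json.loads("""
{"nodes": {"example-node-id": {"name": "example node",
  "indices": {"fielddata": {"memory_size_in_bytes": 123456, "evictions": 0}}}}}
""")
print(fielddata_bytes_by_node(sample))
```

Comparing this number across nodes over time may show whether field data is what fills the old generation before the long GC pauses.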

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr

On May 30, 2014 at 09:31:33, Jorge Ferrando (jorfermo@gmail.com) wrote:

I don't have older metrics in Marvel. I turned it on a few days ago to see if it could help solve the problem.

I couldn't find Field data memory in node statistics. Where can I find it?

On Thu, May 29, 2014 at 3:40 PM, David Pilato david@pilato.fr wrote:
What do the older Marvel metrics show?
What does the field data memory look like?

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On May 29, 2014 at 13:53, Jorge Ferrando jorfermo@gmail.com wrote:

There are recent entries in the log (like 15 mins ago) about gc/young/old

[2014-05-29 13:37:34,183][INFO ][monitor.jvm ] [elastic ASIC nodo 3] [gc][young][38][5] duration [763ms], collections [1]/[1s], total [763ms]/[2.3s], memory [609.6mb]->[166.3mb]/[29.9gb], all_pools {[young] [528.7mb]->[29.8mb]/[532.5mb]}{[survivor] [64.3mb]->[66.5mb]/[66.5mb]}{[old] [16.5mb]->[69.9mb]/[29.3gb]}
[2014-05-29 13:51:17,798][INFO ][monitor.jvm ] [elastic ASIC nodo 3] [gc][young][846][205] duration [727ms], collections [1]/[1.6s], total [727ms]/[1.1m], memory [4.2gb]->[4.2gb]/[29.9gb], all_pools {[young] [11.3mb]->[8.2mb]/[532.5mb]}{[survivor] [66.5mb]->[51.1mb]/[66.5mb]}{[old] [4.2gb]->[4.2gb]/[29.3gb]}

On Thu, May 29, 2014 at 1:51 PM, Jorge Ferrando jorfermo@gmail.com wrote:
What could be the cause of that? Any update of elasticsearch? Any configuration parameter? What should I look for in the logs?

On Thu, May 29, 2014 at 10:51 AM, David Pilato david@pilato.fr wrote:
I think, but might be wrong, that this node, being unresponsive, no longer collects GC data.
Maybe you could look further back, before things started to get worse.

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On May 29, 2014 at 10:43, Jorge Ferrando jorfermo@gmail.com wrote:

This is what Marvel shows for old GC in the last 6 hours for that node:

<image.png>

On Thu, May 29, 2014 at 10:39 AM, David Pilato david@pilato.fr wrote:
It sounds like the old GC is not able to clean old gen space enough.
I guess that if you look at your Marvel dashboards, you can see that on old GC.

So memory pressure is the first guess. You may have too many old GC cycles.

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On May 29, 2014 at 10:32, Jorge Ferrando jorfermo@gmail.com wrote:

Thanks for the answer, David.

I added these settings to elasticsearch.yml some days ago to see if that was the problem:

discovery.zen.ping.timeout: 5s
discovery.zen.fd.ping_interval: 5s
discovery.zen.fd.ping_timeout: 60s
discovery.zen.fd.ping_retries: 3

If I'm not mistaken, with those settings the node should be marked as unavailable after 3m, but most of the time it happens sooner. Am I wrong?
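
As a rough check of the arithmetic in that question, the sketch below multiplies the fault-detection timeout by the retry count. It assumes a simplified model in which a node is dropped once `ping_retries` consecutive pings have each waited the full `ping_timeout`. The function name is hypothetical, and the real zen discovery behaviour may differ in detail.

```python
def worst_case_detection_seconds(ping_timeout_s: float, ping_retries: int) -> float:
    """Upper bound under the simplified model: every retry waits the full timeout."""
    return ping_timeout_s * ping_retries

# Values from the elasticsearch.yml excerpt above:
# discovery.zen.fd.ping_timeout: 60s, discovery.zen.fd.ping_retries: 3
print(worst_case_detection_seconds(60, 3))  # 180 seconds, i.e. 3 minutes
```

This is only an upper bound for a node that stays silent. If the process itself exits and its connections close, the other nodes may notice right away instead of waiting for the timeouts, which would be consistent with the faster failures described here.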

On Thu, May 29, 2014 at 10:29 AM, David Pilato david@pilato.fr wrote:
I think GC took too long, so your node became unresponsive.
If you set 30 GB of RAM, you should increase the ping timeout setting that applies before a node is marked as unresponsive.

And if you are under memory pressure, you could check your requests and see whether they can be optimized, or start new nodes...

My 2 cents.

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On May 29, 2014 at 09:56, Jorge Ferrando jorfermo@gmail.com wrote:

I've been analyzing the problem with Marvel and Nagios and I managed to get 2 more details:

  • The node that restarts/reinitializes is always the same one: node 3.
  • It always happens shortly after the cluster reaches green state, anywhere from a few seconds to 2-3 minutes later.

I have debug mode on in logging.yml:

logger:
  # log action execution errors for easier debugging
  action: DEBUG

But I don't see anything in the log. For instance, the last time it happened, the cluster became green at around 9:47 and the node restarted at 9:50:

[2014-05-29 09:30:57,235][INFO ][monitor.jvm ] [elastic ASIC nodo 3] [gc][young][129][20] duration [745ms], collections [1]/[1s], total [745ms]/[8.5s], memory [951.1mb]->[598.9mb]/[29.9gb], all_pools {[young] [421.5mb]->[8.2mb]/[532.5mb]}{[survivor] [66.5mb]->[66.5mb]/[66.5mb]}{[old] [463.1mb]->[524.1mb]/[29.3gb]}
[2014-05-29 09:45:36,322][WARN ][monitor.jvm ] [elastic ASIC nodo 3] [gc][old][964][1] duration [29.5s], collections [1]/[30.4s], total [29.5s]/[29.5s], memory [5.1gb]->[4.3gb]/[29.9gb], all_pools {[young] [29.4mb]->[34.9mb]/[532.5mb]}{[survivor] [59.9mb]->[0b]/[66.5mb]}{[old] [5gb]->[4.2gb]/[29.3gb]}
[2014-05-29 09:50:41,040][INFO ][node ] [elastic ASIC nodo 3] version[1.2.0], pid[7021], build[c82387f/2014-05-22T12:49:13Z]
[2014-05-29 09:50:41,041][INFO ][node ] [elastic ASIC nodo 3] initializing ...
[2014-05-29 09:50:41,063][INFO ][plugins ] [elastic ASIC nodo 3] loaded [marvel], sites [marvel, paramedic, inquisitor, HQ, bigdesk, head]
[2014-05-29 09:50:47,908][INFO ][node ] [elastic ASIC nodo 3] initialized
[2014-05-29 09:50:47,909][INFO ][node ] [elastic ASIC nodo 3] starting ...

Is there any other way of debugging what's going on with that node?
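
One way to dig further is to scan the node's log for long GC pauses just before each restart. The sketch below is a minimal parser for `monitor.jvm` lines in the format shown above. The function name, the regular expression, and the 5-second threshold are assumptions chosen for illustration, not part of Elasticsearch.

```python
import re

# Matches the timestamp, the monitor.jvm tag, and the
# "[gc][<pool>][..][..] duration [<value><unit>]" part of a log line.
GC_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\[(?P<level>\w+)\s*\]\[monitor\.jvm\s*\].*?"
    r"\[gc\]\[(?P<pool>young|old)\]\[\d+\]\[\d+\] "
    r"duration \[(?P<value>[\d.]+)(?P<unit>ms|s|m)\]"
)
UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "m": 60.0}

def long_gc_pauses(lines, threshold_s=5.0):
    """Return (timestamp, pool, seconds) for each GC entry at or above threshold_s."""
    found = []
    for line in lines:
        match = GC_LINE.search(line)
        if not match:
            continue
        seconds = float(match["value"]) * UNIT_SECONDS[match["unit"]]
        if seconds >= threshold_s:
            found.append((match["ts"], match["pool"], seconds))
    return found

# Two entries from the log above, cut off after the duration field.
sample = [
    "[2014-05-29 09:30:57,235][INFO ][monitor.jvm ] [elastic ASIC nodo 3] "
    "[gc][young][129][20] duration [745ms]",
    "[2014-05-29 09:45:36,322][WARN ][monitor.jvm ] [elastic ASIC nodo 3] "
    "[gc][old][964][1] duration [29.5s]",
]
print(long_gc_pauses(sample))  # [('2014-05-29 09:45:36,322', 'old', 29.5)]
```

On these two entries only the 29.5 s old-generation collection is flagged, about five minutes before the restart at 09:50.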

On Tue, May 27, 2014 at 12:49 PM, Jorge Ferrando jorfermo@gmail.com wrote:
I thought about that, but it would be strange: these are 3 virtual machines in the same VMware cluster alongside hundreds of other services, and nobody has reported any networking problems.

On Thu, May 22, 2014 at 3:16 PM, emeschitc emeschitc@gmail.com wrote:
Hi,

I may be wrong, but it seems to me you have a problem with your network. It may be a flaky connection, a broken NIC, or something wrong with your configuration for discovery and/or data transport.

Caused by: org.elasticsearch.transport.NodeNotConnectedException: [elastic ASIC nodo 2][inet[/158.42.250.79:9301]] Node not connected
at org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:859)
at org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:540)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:189)

Check the status of the network on this node.
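
A quick way to act on that suggestion is to test whether the transport port of the other node accepts TCP connections. The sketch below is a minimal check, not an Elasticsearch tool; the function name and the 3-second timeout are assumptions.

```python
import socket

def can_connect(host: str, port: int, timeout_s: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

For the address in the stack trace this would be called as `can_connect("158.42.250.79", 9301)` from node 3. A successful connection only shows that the port is reachable at that moment; it says nothing about whether the node is responsive inside the cluster.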

On Thu, May 22, 2014 at 2:07 PM, Jorge Ferrando [via ElasticSearch Users] <[hidden email]> wrote:
Hello

We have a cluster of 3 nodes running Ubuntu 12.04.4 LTS 64-bit and Elasticsearch v1.1.1.

It had been running flawlessly, but since last week some of the nodes restart randomly and the cluster goes to red state, then yellow, then green, and then it happens again in a loop (sometimes it doesn't even reach green state).

I've tried looking at the logs but I can't find an obvious reason for what is going on.

I've found entries like these, but I don't know if they are in some way related to the crash:

[2014-05-22 13:55:16,150][WARN ][index.codec ] [elastic ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field: [date_end] returning default postings format
[2014-05-22 13:55:16,151][WARN ][index.codec ] [elastic ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field: [date_end.raw] returning default postings format
[2014-05-22 13:55:16,151][WARN ][index.codec ] [elastic ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field: [date_start] returning default postings format
[2014-05-22 13:55:16,151][WARN ][index.codec ] [elastic ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field: [date_start.raw] returning default postings format

For instance, just now it was in yellow state, really close to reaching green, when node 3 suddenly restarted itself, and now the cluster is red with 2000 shards initializing. The log on that node shows this:

[2014-05-22 13:59:48,498][INFO ][monitor.jvm ] [elastic ASIC nodo 3] [gc][young][1181][222] duration [735ms], collections [1]/[1s], total [735ms]/[1.1m], memory [6.5gb]->[6.1gb]/[19.9gb], all_pools {[young] [456mb]->[7.2mb]/[532.5mb]}{[survivor] [66.5mb]->[66.5mb]/[66.5mb]}{[old] [6gb]->[6gb]/[19.3gb]}
[2014-05-22 14:03:44,825][INFO ][node ] [elastic ASIC nodo 3] version[1.1.1], pid[7511], build[f1585f0/2014-04-16T14:27:12Z]
[2014-05-22 14:03:44,826][INFO ][node ] [elastic ASIC nodo 3] initializing ...
[2014-05-22 14:03:44,839][INFO ][plugins ] [elastic ASIC nodo 3] loaded [], sites [paramedic, inquisitor, HQ, bigdesk, head]
[2014-05-22 14:03:51,967][INFO ][node ] [elastic ASIC nodo 3] initialized
[2014-05-22 14:03:51,967][INFO ][node ] [elastic ASIC nodo 3] starting ...

The crash happened exactly at 14:02.

Any idea what could be going on, or how I can trace what's happening?

After rebooting there are also DEBUG errors like this:

[2014-05-22 14:06:16,621][DEBUG][action.search.type ] [elastic ASIC nodo 3] [logstash-2014.05.21][1], node[jgwbxcBoTVa3JIIG5a_FJA], [P], s[STARTED]: Failed to execute [org.elasticsearch.action.search.SearchRequest@42b80f4a] lastShard [true]
org.elasticsearch.transport.SendRequestTransportException: [elastic ASIC nodo 2][inet[/158.42.250.79:9301]][search/phase/query]
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:202)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:173)
at org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:208)
at org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.start(TransportSearchTypeAction.java:143)
at org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction.doExecute(TransportSearchQueryThenFetchAction.java:59)
at org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction.doExecute(TransportSearchQueryThenFetchAction.java:49)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
at org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:108)
at org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:43)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
at org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:92)
at org.elasticsearch.client.support.AbstractClient.search(AbstractClient.java:212)
at org.elasticsearch.rest.action.search.RestSearchAction.handleRequest(RestSearchAction.java:98)
at org.elasticsearch.rest.RestController.executeHandler(RestController.java:159)
at org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:142)
at org.elasticsearch.http.HttpServer.internalDispatchRequest(HttpServer.java:121)
at org.elasticsearch.http.HttpServer$Dispatcher.dispatchRequest(HttpServer.java:83)
at org.elasticsearch.http.netty.NettyHttpServerTransport.dispatchRequest(NettyHttpServerTransport.java:291)
at org.elasticsearch.http.netty.HttpRequestHandler.messageReceived(HttpRequestHandler.java:43)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:145)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)
at org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)
at org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:74)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: org.elasticsearch.transport.NodeNotConnectedException: [elastic ASIC nodo 2][inet[/158.42.250.79:9301]] Node not connected
at org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:859)
at org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:540)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:189)
... 50 more



(Jorge Ferrando) #20

I guess it's this:

[image: Inline image 1]

On Fri, May 30, 2014 at 10:01 AM, David Pilato david@pilato.fr wrote:

It's in index statistics under memory row.
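For anyone without Marvel at hand, the same number can probably be read from the node stats API. This is a minimal sketch, assuming the response has the `nodes.<id>.indices.fielddata.memory_size_in_bytes` shape that Elasticsearch 1.x returns from `/_nodes/stats/indices/fielddata`. The host `localhost:9200` and both function names are illustrative and do not come from this thread.

```python
import json
from urllib.request import urlopen


def fielddata_bytes_by_node(stats):
    """Map node name -> fielddata memory in bytes from a parsed node-stats response."""
    result = {}
    for node_id, node in stats.get("nodes", {}).items():
        fielddata = node.get("indices", {}).get("fielddata", {})
        # Fall back to the node id when the response carries no node name.
        result[node.get("name", node_id)] = fielddata.get("memory_size_in_bytes", 0)
    return result


def fetch_node_stats(base_url="http://localhost:9200"):
    # Assumed host and endpoint; adjust to your own cluster.
    with urlopen(base_url + "/_nodes/stats/indices/fielddata") as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fielddata_bytes_by_node(fetch_node_stats())` against a live node should give one entry per node, which makes it easy to compare node 3 with the other two.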

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet https://twitter.com/dadoonet | @elasticsearchfr
https://twitter.com/elasticsearchfr

On 30 May 2014 at 09:31:33, Jorge Ferrando (jorfermo@gmail.com) wrote:

I don't have older metrics in Marvel. I turned it on a few days ago to see
if it could help solve the problem.

I couldn't find Field data memory in node statistics. Where can I find it?

On Thu, May 29, 2014 at 3:40 PM, David Pilato david@pilato.fr wrote:

What do the older Marvel metrics show?
What does the field data memory look like?

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 29 May 2014 at 13:53, Jorge Ferrando jorfermo@gmail.com wrote:

There are recent entries in the log (like 15 mins ago) about
gc/young/old

[2014-05-29 13:37:34,183][INFO ][monitor.jvm ] [elastic ASIC nodo 3] [gc][young][38][5] duration [763ms], collections [1]/[1s], total [763ms]/[2.3s], memory [609.6mb]->[166.3mb]/[29.9gb], all_pools {[young] [528.7mb]->[29.8mb]/[532.5mb]}{[survivor] [64.3mb]->[66.5mb]/[66.5mb]}{[old] [16.5mb]->[69.9mb]/[29.3gb]}
[2014-05-29 13:51:17,798][INFO ][monitor.jvm ] [elastic ASIC nodo 3] [gc][young][846][205] duration [727ms], collections [1]/[1.6s], total [727ms]/[1.1m], memory [4.2gb]->[4.2gb]/[29.9gb], all_pools {[young] [11.3mb]->[8.2mb]/[532.5mb]}{[survivor] [66.5mb]->[51.1mb]/[66.5mb]}{[old] [4.2gb]->[4.2gb]/[29.3gb]}

On Thu, May 29, 2014 at 1:51 PM, Jorge Ferrando jorfermo@gmail.com
wrote:

What could be the cause of that? Any update of elasticsearch? Any
configuration parameter? What should I look for in the logs?

On Thu, May 29, 2014 at 10:51 AM, David Pilato david@pilato.fr wrote:

I think, though I might be wrong, that once this node is unresponsive it no
longer collects GC data.
Maybe you could look further back, before things started to get worse.

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 29 May 2014 at 10:43, Jorge Ferrando jorfermo@gmail.com wrote:

This is what Marvel shows for old GC in the last 6 hours for that
node:

<image.png>

On Thu, May 29, 2014 at 10:39 AM, David Pilato david@pilato.fr wrote:

It sounds like the old GC is not able to free enough old gen space.
I guess that if you look at your Marvel dashboards, you can see that
on old GC.

So memory pressure is the first guess. You may have too many old GC
cycles.

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 29 May 2014 at 10:32, Jorge Ferrando jorfermo@gmail.com wrote:

Thanks for the answer David

I added these settings to elasticsearch.yml some days ago to see if that
was the problem:

discovery.zen.ping.timeout: 5s
discovery.zen.fd.ping_interval: 5s
discovery.zen.fd.ping_timeout: 60s
discovery.zen.fd.ping_retries: 3

If I'm not mistaken, with those settings the node should be marked as
unavailable after 3 minutes, yet most of the time it happens sooner. Am I wrong?
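As a rough check on that 3-minute figure, here is a minimal sketch of the arithmetic. It assumes each failed ping waits the full `ping_timeout` and that retries follow one another back to back, which is a simplification and not a description of how Elasticsearch actually schedules its pings. The function name is made up for illustration.

```python
def worst_case_detection_seconds(ping_timeout_s, ping_retries, ping_interval_s=0.0):
    """Upper bound on how long a silent node is tolerated, under the stated assumptions."""
    # Optionally count one ping interval before the first failing ping is sent.
    return ping_interval_s + ping_timeout_s * ping_retries


# Values taken from the elasticsearch.yml settings quoted above.
budget = worst_case_detection_seconds(ping_timeout_s=60, ping_retries=3)
print(budget / 60)  # 3.0 (minutes)
```

Under these assumptions the settings only bound how long an unresponsive node is waited for. A node whose process actually exits or restarts may be noticed sooner, which could explain why it happens more quickly than 3 minutes.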

On Thu, May 29, 2014 at 10:29 AM, David Pilato david@pilato.fr
wrote:

I think GC took too long, so your node became unresponsive.
If you give it 30 GB of RAM, you should increase the ping timeout that applies
before a node is marked as unresponsive.

And if you are under memory pressure, you could check your requests to see
whether they can be optimized, or start new nodes...

My 2 cents.

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 29 May 2014 at 09:56, Jorge Ferrando jorfermo@gmail.com wrote:

I've been analyzing the problem with Marvel and Nagios and I
managed to get 2 more details:

  • The node that restarts/reinitializes is always the same: node 3.
  • It always happens shortly after the cluster reaches green state,
    between a few seconds and 2-3 minutes later.

I have debug mode on in logging.yml:

logger:
  # log action execution errors for easier debugging
  action: DEBUG

But I don't see anything in the log. For instance, the last
time it happened, the cluster became green at around 9:47 and the node
restarted at 9:50:

[2014-05-29 09:30:57,235][INFO ][monitor.jvm ] [elastic ASIC nodo 3] [gc][young][129][20] duration [745ms], collections [1]/[1s], total [745ms]/[8.5s], memory [951.1mb]->[598.9mb]/[29.9gb], all_pools {[young] [421.5mb]->[8.2mb]/[532.5mb]}{[survivor] [66.5mb]->[66.5mb]/[66.5mb]}{[old] [463.1mb]->[524.1mb]/[29.3gb]}
[2014-05-29 09:45:36,322][WARN ][monitor.jvm ] [elastic ASIC nodo 3] [gc][old][964][1] duration [29.5s], collections [1]/[30.4s], total [29.5s]/[29.5s], memory [5.1gb]->[4.3gb]/[29.9gb], all_pools {[young] [29.4mb]->[34.9mb]/[532.5mb]}{[survivor] [59.9mb]->[0b]/[66.5mb]}{[old] [5gb]->[4.2gb]/[29.3gb]}
[2014-05-29 09:50:41,040][INFO ][node ] [elastic ASIC nodo 3] version[1.2.0], pid[7021], build[c82387f/2014-05-22T12:49:13Z]
[2014-05-29 09:50:41,041][INFO ][node ] [elastic ASIC nodo 3] initializing ...
[2014-05-29 09:50:41,063][INFO ][plugins ] [elastic ASIC nodo 3] loaded [marvel], sites [marvel, paramedic, inquisitor, HQ, bigdesk, head]
[2014-05-29 09:50:47,908][INFO ][node ] [elastic ASIC nodo 3] initialized
[2014-05-29 09:50:47,909][INFO ][node ] [elastic ASIC nodo 3] starting ...
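One way to keep an eye on long pauses without Marvel is to scan the node log for `monitor.jvm` entries. This is a minimal sketch, assuming GC lines keep the `[gc][young|old][..][..] duration [..]` shape shown above. The regex and the function name are illustrative and are not part of Elasticsearch.

```python
import re

# Matches the generation and the pause duration of a monitor.jvm GC entry.
GC_RE = re.compile(
    r"\[gc\]\[(?P<gen>young|old)\]\[\d+\]\[\d+\] "
    r"duration \[(?P<duration>[\d.]+)(?P<unit>ms|s|m)\]"
)

UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "m": 60.0}


def parse_gc_line(line):
    """Return (generation, pause in seconds) for a GC log line, or None if it is not one."""
    text = " ".join(line.split())  # undo any e-mail line wrapping
    match = GC_RE.search(text)
    if match is None:
        return None
    seconds = float(match.group("duration")) * UNIT_SECONDS[match.group("unit")]
    return match.group("gen"), seconds


line = (
    "[2014-05-29 09:45:36,322][WARN ][monitor.jvm ] [elastic ASIC nodo 3] "
    "[gc][old][964][1] duration [29.5s], collections [1]/[30.4s]"
)
print(parse_gc_line(line))  # ('old', 29.5)
```

Filtering the results for old-generation pauses longer than a few seconds would single out entries like the 29.5s collection logged at 09:45, a few minutes before the restart.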

Is there any other way of debugging what's going on with that node?

On Tue, May 27, 2014 at 12:49 PM, Jorge Ferrando jorfermo@gmail.com
wrote:

I thought about that, but it would be strange because these are 3
virtual machines in the same VMware cluster, alongside hundreds of other
services, and nobody has reported any networking problem.

On Thu, May 22, 2014 at 3:16 PM, emeschitc emeschitc@gmail.com
wrote:

Hi,

I may be wrong, but it seems to me you have a problem with your
network. It may be a flaky connection, a broken NIC, or something wrong with
your configuration for discovery and/or data transport.

Caused by: org.elasticsearch.transport.NodeNotConnectedException: [elastic ASIC nodo 2][inet[/158.42.250.79:9301]] Node not connected
at org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:859)
at org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:540)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:189)

Check the status of the network on this node.

On Thu, May 22, 2014 at 2:07 PM, Jorge Ferrando [via ElasticSearch
Users] wrote:

Hello

We have a cluster of 3 nodes running Ubuntu 12.04.4 LTS 64-bit
and elasticsearch v1.1.1.

It had been running flawlessly, but since last week some of the
nodes restart randomly and the cluster goes to red state, then yellow, then
green, and it happens again in a loop (sometimes it doesn't even reach green
state).

I've tried to look at the logs but I can't find an obvious reason
for what is going on.

I've found entries like these, but I don't know if they are in
some way related to the crash:

[2014-05-22 13:55:16,150][WARN ][index.codec ] [elastic ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field: [date_end] returning default postings format
[2014-05-22 13:55:16,151][WARN ][index.codec ] [elastic ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field: [date_end.raw] returning default postings format
[2014-05-22 13:55:16,151][WARN ][index.codec ] [elastic ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field: [date_start] returning default postings format
[2014-05-22 13:55:16,151][WARN ][index.codec ] [elastic ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field: [date_start.raw] returning default postings format

For instance, just now it was in yellow state, really close to
green, when suddenly node 3 restarted itself, and now the cluster is red
with 2000 shards initializing. The log on that node shows this:

[2014-05-22 13:59:48,498][INFO ][monitor.jvm ] [elastic ASIC nodo 3] [gc][young][1181][222] duration [735ms], collections [1]/[1s], total [735ms]/[1.1m], memory [6.5gb]->[6.1gb]/[19.9gb], all_pools {[young] [456mb]->[7.2mb]/[532.5mb]}{[survivor] [66.5mb]->[66.5mb]/[66.5mb]}{[old] [6gb]->[6gb]/[19.3gb]}
[2014-05-22 14:03:44,825][INFO ][node ] [elastic ASIC nodo 3] version[1.1.1], pid[7511], build[f1585f0/2014-04-16T14:27:12Z]
[2014-05-22 14:03:44,826][INFO ][node ] [elastic ASIC nodo 3] initializing ...
[2014-05-22 14:03:44,839][INFO ][plugins ] [elastic ASIC nodo 3] loaded [], sites [paramedic, inquisitor, HQ, bigdesk, head]
[2014-05-22 14:03:51,967][INFO ][node ] [elastic ASIC nodo 3] initialized
[2014-05-22 14:03:51,967][INFO ][node ] [elastic ASIC nodo 3] starting ...

The crash happened exactly at 14:02.

Any idea what could be going on, or how I can trace what's happening?

After rebooting there are also DEBUG errors like this:

[2014-05-22 14:06:16,621][DEBUG][action.search.type ] [elastic ASIC nodo 3] [logstash-2014.05.21][1], node[jgwbxcBoTVa3JIIG5a_FJA], [P], s[STARTED]: Failed to execute [org.elasticsearch.action.search.SearchRequest@42b80f4a] lastShard [true]
org.elasticsearch.transport.SendRequestTransportException: [elastic ASIC nodo 2][inet[/158.42.250.79:9301]][search/phase/query]
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:202)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:173)
at org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:208)
at org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.start(TransportSearchTypeAction.java:143)
at org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction.doExecute(TransportSearchQueryThenFetchAction.java:59)
at org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction.doExecute(TransportSearchQueryThenFetchAction.java:49)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
at org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:108)
at org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:43)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
at org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:92)
at org.elasticsearch.client.support.AbstractClient.search(AbstractClient.java:212)
at org.elasticsearch.rest.action.search.RestSearchAction.handleRequest(RestSearchAction.java:98)
at org.elasticsearch.rest.RestController.executeHandler(RestController.java:159)
at org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:142)
at org.elasticsearch.http.HttpServer.internalDispatchRequest(HttpServer.java:121)
at org.elasticsearch.http.HttpServer$Dispatcher.dispatchRequest(HttpServer.java:83)
at org.elasticsearch.http.netty.NettyHttpServerTransport.dispatchRequest(NettyHttpServerTransport.java:291)
at org.elasticsearch.http.netty.HttpRequestHandler.messageReceived(HttpRequestHandler.java:43)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:145)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)
at org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)
at org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:74)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: org.elasticsearch.transport.NodeNotConnectedException: [elastic ASIC nodo 2][inet[/158.42.250.79:9301]] Node not connected
at org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:859)
at org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:540)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:189)
... 50 more
