"failed to execute on node" Exception on elasticsearch 2.3.2

Hi, I see the exception below consistently in the Elasticsearch log file on all three of my nodes. The cluster health is green. Can anyone help? I think it's what brought Elasticsearch down after a while, as I just had an outage today (5/10), the first since last Friday's (5/6) cluster upgrade.

[2016-05-10 16:12:39,369][DEBUG][action.admin.cluster.node.stats] [fslelkprod01] failed to execute on node [QSXAvrCzQQGDoprePsPzTQ]
RemoteTransportException[[fslelkprod01][fslelkprod01/10.193.91.25:9300][cluster:monitor/nodes/stats[n]]]; nested: AlreadyClosedException[this IndexReader is closed];
Caused by: org.apache.lucene.store.AlreadyClosedException: this IndexReader is closed
at org.apache.lucene.index.IndexReader.ensureOpen(IndexReader.java:274)
at org.apache.lucene.index.CompositeReader.getContext(CompositeReader.java:101)
at org.apache.lucene.index.CompositeReader.getContext(CompositeReader.java:55)
at org.apache.lucene.index.IndexReader.leaves(IndexReader.java:438)
at org.elasticsearch.search.suggest.completion.Completion090PostingsFormat.completionStats(Completion090PostingsFormat.java:330)
at org.elasticsearch.index.shard.IndexShard.completionStats(IndexShard.java:765)
at org.elasticsearch.action.admin.indices.stats.CommonStats.<init>(CommonStats.java:164)
at org.elasticsearch.indices.IndicesService.stats(IndicesService.java:253)
at org.elasticsearch.node.service.NodeService.stats(NodeService.java:158)
at org.elasticsearch.action.admin.cluster.node.stats.TransportNodesStatsAction.nodeOperation(TransportNodesStatsAction.java:82)
at org.elasticsearch.action.admin.cluster.node.stats.TransportNodesStatsAction.nodeOperation(TransportNodesStatsAction.java:44)
at org.elasticsearch.action.support.nodes.TransportNodesAction.nodeOperation(TransportNodesAction.java:92)
at org.elasticsearch.action.support.nodes.TransportNodesAction$NodeTransportHandler.messageReceived(TransportNodesAction.java:230)
at org.elasticsearch.action.support.nodes.TransportNodesAction$NodeTransportHandler.messageReceived(TransportNodesAction.java:226)
at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:75)
at org.elasticsearch.transport.TransportService$4.doRun(TransportService.java:376)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
[2016-05-10 16:12:39,371][ERROR][marvel.agent.collector.node] [fslelkprod01] collector [node-stats-collector] - failed collecting data
java.lang.ArrayIndexOutOfBoundsException: 0
at org.elasticsearch.action.support.nodes.BaseNodesResponse.getAt(BaseNodesResponse.java:72)
at org.elasticsearch.marvel.agent.collector.node.NodeStatsCollector.doCollect(NodeStatsCollector.java:88)
at org.elasticsearch.marvel.agent.collector.AbstractCollector.collect(AbstractCollector.java:99)
at org.elasticsearch.marvel.agent.AgentService$ExportingWorker.run(AgentService.java:187)
at java.lang.Thread.run(Thread.java:745)
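
For reference, this is roughly how I'm checking the cluster health and which version each node is running (a minimal sketch, assuming the default HTTP port 9200 on localhost):

# cluster health (this is where I see "green")
curl -s 'localhost:9200/_cluster/health?pretty'

# version running on each node
curl -s 'localhost:9200/_cat/nodes?v&h=name,ip,version'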

Unfortunately, this was a bug introduced in 2.3.0. It has just been fixed and will be released in 2.3.3: https://github.com/elastic/elasticsearch/pull/18094

Hi Zachary, thanks for the information. How can we work around this before 2.3.3 with the bug fix is released?

Is your JVM crashing? Does it throw this exception right before the crash?

Unless you fall into the edge case with mmapfs crashing, this is basically a harmless exception. You'll see it spammed in your log a lot, but it's otherwise not a problem (the completion stats will be computed incorrectly, that's all).
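
If you want to rule out the mmapfs edge case, one thing you could check is whether any index explicitly sets its store type to mmapfs. A sketch, assuming the default HTTP port; indices on the default hybrid store won't report an explicit index.store.type:

# list any indices that explicitly set index.store.type
curl -s 'localhost:9200/_all/_settings/index.store.type?pretty'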

I searched for hs_err_pidXXXX.log but didn't find one. Does that mean the JVM didn't crash?
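
For what it's worth, the JVM writes hs_err_pid<pid>.log to the working directory of the Elasticsearch process by default, or wherever -XX:ErrorFile points. Something like this should turn one up if a crash happened (a sketch, assuming a Linux box and permission to search these paths):

# look for JVM fatal error logs anywhere on the local filesystem
find / -xdev -name 'hs_err_pid*.log' 2>/dev/null

# also check whether the JVM was started with a custom error-file location
ps aux | grep -o -- '-XX:ErrorFile=[^ ]*'

If find returns nothing and no -XX:ErrorFile is set, the JVM most likely did not crash.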

Looks like I am hitting the same thing with 2.3.1. My indexing rates are really low compared to normal, and my cluster log file only has a few entries... It doesn't appear to be a harmless event for our cluster. Is there anything else I can check?
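
For context, this is roughly how I'm watching the indexing back-pressure at the moment (assuming the default HTTP port and that we index through the bulk API):

# active threads, queued requests and rejections for the bulk thread pool on each node
curl -s 'localhost:9200/_cat/thread_pool?v&h=host,bulk.active,bulk.queue,bulk.rejected'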

I would start a new thread, as this one is a few months old and seems to have run its course.

That being said, have you set swappiness and vm.max_map_count? Tuning those made one of our clusters much happier on resources: it stopped the swapping, helped I/O, and made indexing much faster.

Hopefully you are on Linux? Another reason to start a new thread is that you can give info about your setup there.

Run sysctl vm.max_map_count; it should return 262144.
And cat /proc/sys/vm/swappiness should return 1 or 0.

If it isn't, set it with:
sysctl -w vm.max_map_count=262144
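
Note that sysctl -w only lasts until the next reboot. A minimal sketch of making both settings stick, assuming a typical Linux box where /etc/sysctl.conf is read at boot (run as root or with sudo):

# persist the settings across reboots
echo 'vm.max_map_count=262144' >> /etc/sysctl.conf
echo 'vm.swappiness=1' >> /etc/sysctl.conf
# apply them immediately without rebooting
sysctl -p

On Elasticsearch 2.x you can also set bootstrap.mlockall: true in elasticsearch.yml so the heap cannot be swapped out, provided the user running Elasticsearch is allowed to lock memory (ulimit -l unlimited).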