Error with shards, not enough debug messages

Anjan_Pathak · March 11, 2015, 1:22pm

Hi Guys,

We were running Elasticsearch 1.2.1 when, suddenly, every query started
failing. The logs showed messages like this:

[2015-03-11 18:47:37,183][DEBUG][action.search.type ] [Doctor Strange] All shards failed for phase: [query_fetch]

We deleted everything and even upgraded the instance to the latest version
(elasticsearch-1.4.4), but queries still fail. All our data is in MySQL and
our index is very small. We deleted the entire index and recreated it after
the upgrade, but we still cannot recover from the error. There seems to be
something particular to our production machine, as everything works well on
our test server. Elasticsearch is not giving us enough debug output to
diagnose the problem. The latest output after restarting Elasticsearch is this:

[vantage@vc-prod elasticsearch-1.4.4]$ bin/elasticsearch

[2015-03-11 18:46:17,904][INFO ][node ] [Doctor Strange] version[1.4.4], pid[13798], build[c88f77f/2015-02-19T13:05:36Z]
[2015-03-11 18:46:17,905][INFO ][node ] [Doctor Strange] initializing ...
[2015-03-11 18:46:17,911][INFO ][plugins ] [Doctor Strange] loaded [], sites []
[2015-03-11 18:46:21,450][INFO ][node ] [Doctor Strange] initialized
[2015-03-11 18:46:21,450][INFO ][node ] [Doctor Strange] starting ...
[2015-03-11 18:46:21,626][INFO ][transport ] [Doctor Strange] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/128.199.143.152:9300]}
[2015-03-11 18:46:21,700][INFO ][discovery ] [Doctor Strange] elasticsearch/2bQbRn93SOeDdmDLHijwBg
[2015-03-11 18:46:25,482][INFO ][cluster.service ] [Doctor Strange] new_master [Doctor Strange][2bQbRn93SOeDdmDLHijwBg][vc-prod][inet[/128.199.143.152:9300]], reason: zen-disco-join (elected_as_master)
[2015-03-11 18:46:25,519][INFO ][http ] [Doctor Strange] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/128.199.143.152:9200]}
[2015-03-11 18:46:25,520][INFO ][node ] [Doctor Strange] started
[2015-03-11 18:46:26,459][INFO ][gateway ] [Doctor Strange] recovered [2] indices into cluster_state
[2015-03-11 18:46:26,639][DEBUG][action.search.type ] [Doctor Strange] All shards failed for phase: [query_fetch]

[2015-03-11 18:46:26,862][WARN ][indices.cluster ] [Doctor Strange] [thedealspoint][0] failed to start shard
org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: [thedealspoint][0] failed recovery
    at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:185)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.elasticsearch.index.engine.FlushFailedEngineException: [thedealspoint][0] Flush failed
    at org.elasticsearch.index.engine.internal.InternalEngine.flush(InternalEngine.java:926)
    at org.elasticsearch.index.shard.service.InternalIndexShard.performRecoveryFinalization(InternalIndexShard.java:749)
    at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:291)
    at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:132)
    ... 3 more
Caused by: java.lang.NullPointerException
    at org.elasticsearch.search.suggest.completion.Completion090PostingsFormat$CompletionLookupProvider.parsePayload(Completion090PostingsFormat.java:337)
    at org.elasticsearch.search.suggest.completion.AnalyzingCompletionLookupProvider$CompletionPostingsConsumer.addPosition(AnalyzingCompletionLookupProvider.java:189)
    at org.elasticsearch.search.suggest.completion.Completion090PostingsFormat$GroupedPostingsConsumer.addPosition(Completion090PostingsFormat.java:188)
    at org.apache.lucene.index.FreqProxTermsWriterPerField.flush(FreqProxTermsWriterPerField.java:488)
    at org.apache.lucene.index.FreqProxTermsWriter.flush(FreqProxTermsWriter.java:80)
    at org.apache.lucene.index.DefaultIndexingChain.flush(DefaultIndexingChain.java:114)
    at org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:439)
    at org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:513)
    at org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:624)
    at org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2949)
    at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:3104)
    at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3071)
    at org.elasticsearch.index.engine.internal.InternalEngine.flush(InternalEngine.java:916)
    ... 6 more

[2015-03-11 18:46:26,890][WARN ][cluster.action.shard ] [Doctor Strange] [thedealspoint][0] sending failed shard for [thedealspoint][0], node[2bQbRn93SOeDdmDLHijwBg], [P], s[INITIALIZING], indexUUID [vjORGaDpSsKPcGddR2ELdg], reason [Failed to start shard, message [IndexShardGatewayRecoveryException[[thedealspoint][0] failed recovery]; nested: FlushFailedEngineException[[thedealspoint][0] Flush failed]; nested: NullPointerException; ]]
[2015-03-11 18:46:26,890][WARN ][cluster.action.shard ] [Doctor Strange] [thedealspoint][0] received shard failed for [thedealspoint][0], node[2bQbRn93SOeDdmDLHijwBg], [P], s[INITIALIZING], indexUUID [vjORGaDpSsKPcGddR2ELdg], reason [Failed to start shard, message [IndexShardGatewayRecoveryException[[thedealspoint][0] failed recovery]; nested: FlushFailedEngineException[[thedealspoint][0] Flush failed]; nested: NullPointerException; ]]
[2015-03-11 18:46:28,818][DEBUG][action.search.type ] [Doctor Strange] All shards failed for phase: [query_fetch]
[2015-03-11 18:46:30,173][DEBUG][action.search.type ] [Doctor Strange] All shards failed for phase: [query_fetch]
[2015-03-11 18:46:33,339][DEBUG][action.search.type ] [Doctor Strange] All shards failed for phase: [query_fetch]
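
The repeated "All shards failed" lines say little by themselves; the useful part of the output above is the innermost "Caused by" of the stack trace. As a rough sketch (not an Elasticsearch tool), the Python below pulls that innermost cause and its first frame out of a pasted trace. The function name `root_cause` and the abridged `SAMPLE` text are made up for illustration, and the code assumes each frame sits on one line starting with "at ".

```python
def root_cause(trace):
    """Return (exception, message, first_frame) for the innermost cause.

    Assumes a standard Java stack trace: optional "Caused by:" headers,
    each followed by frames that start with "at ".
    """
    lines = [ln.strip() for ln in trace.splitlines() if ln.strip()]
    # The last "Caused by:" header is the innermost cause; if there is
    # none, the first line of the trace is the exception itself.
    idx = 0
    for i, ln in enumerate(lines):
        if ln.startswith("Caused by:"):
            idx = i
    header = lines[idx]
    if header.startswith("Caused by:"):
        header = header[len("Caused by:"):].strip()
    # Split "some.Exception: message"; the message may be empty.
    exc, _, msg = header.partition(":")
    # The first frame after the header is where the exception was thrown.
    frame = None
    for ln in lines[idx + 1:]:
        if ln.startswith("at "):
            frame = ln[3:].strip()
            break
    return exc.strip(), msg.strip(), frame


# Abridged from the trace above: one frame kept per exception.
SAMPLE = """\
org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: [thedealspoint][0] failed recovery
    at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:185)
Caused by: org.elasticsearch.index.engine.FlushFailedEngineException: [thedealspoint][0] Flush failed
    at org.elasticsearch.index.engine.internal.InternalEngine.flush(InternalEngine.java:926)
Caused by: java.lang.NullPointerException
    at org.elasticsearch.search.suggest.completion.Completion090PostingsFormat$CompletionLookupProvider.parsePayload(Completion090PostingsFormat.java:337)
"""
```

For the sample, `root_cause(SAMPLE)` returns the `java.lang.NullPointerException` and the `parsePayload` frame in `Completion090PostingsFormat`, which suggests looking at fields that use the completion suggester, although the trace alone does not say which document triggers it.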

Thanks for the help in advance.

Anjan

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/628838f3-1c7e-40cd-afe1-ca8848d39f78%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

dadoonet · March 11, 2015, 2:32pm

I think you should open an issue with the full description you wrote so far.
NullPointerException is something we should avoid...

Best

--
David
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

Anjan_Pathak · March 11, 2015, 2:51pm

Thanks for the reply, David. Things were very strange, though. I have now
recovered from the issue by creating a completely new machine, installing
Elasticsearch, and indexing from scratch. I could do this only because my
index was small. To give a bit more background: I had Elasticsearch 1.2.1
installed from the RPM and then, as I could not recover, I installed a new
copy in a new directory by unzipping the archive. I then tried everything to
reindex, but nothing worked. The index status would stay yellow for a while,
then suddenly turn red, and queries would stop working. I would like to use
my old installation, but I am afraid it does not start up with my index. I
will still go and open an issue.
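
For anyone who wants to watch the yellow-to-red transition described above, here is a minimal Python sketch that reads the cluster health API (`GET /_cluster/health`) and summarizes the result. The address `http://localhost:9200` and both function names are assumptions for illustration; the response fields used (`status`, `unassigned_shards`, `initializing_shards`) are part of the standard health response.

```python
import json
from urllib.request import urlopen


def fetch_health(base_url="http://localhost:9200"):
    """Fetch GET /_cluster/health from the node at base_url (assumed address)."""
    with urlopen(base_url + "/_cluster/health", timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))


def describe_health(health):
    """Turn a cluster health response (a dict) into a one-line summary."""
    status = health.get("status", "unknown")
    counts = "unassigned={}, initializing={}".format(
        health.get("unassigned_shards", 0),
        health.get("initializing_shards", 0),
    )
    if status == "red":
        meaning = "at least one primary shard is not allocated"
    elif status == "yellow":
        meaning = "all primaries are allocated, but some replicas are not"
    elif status == "green":
        meaning = "all primary and replica shards are allocated"
    else:
        meaning = "status missing or unrecognized"
    return "{}: {} ({})".format(status, meaning, counts)
```

Calling `describe_health(fetch_health())` in a loop while reindexing would show when the status flips. On a single node, yellow is expected whenever an index has replicas configured, so red is the state that matters here: it would be consistent with the primary shard failing to start, as in the log from the original post.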

Thanks,
Anjan
