Failed to execute org.elasticsearch.action.search.SearchRequest error

bluethundr · July 12, 2015, 12:59am

Hey guys,

I noticed that results stopped turning up in Logstash. And when I tailed the Elasticsearch logs, I noticed this exception repeated over and over again, a total of 85 times:

    #grep "Failed to execute \[org.elasticsearch.action.search.SearchRequest@2c539ad4\] lastShard \[true\]" /var/log/elasticsearch/elasticsearch.log | wc -l
 85

I don't see these errors in the ES logs on the other two nodes. I only see it on the master node.

Here's the full output of the error:

[2015-07-11 20:23:56,239][DEBUG][action.search.type       ] [JF_ES1] All shards failed for phase: [query]
org.elasticsearch.transport.RemoteTransportException: [JF_ES2][inet[/10.10.10.26:9300]][indices:data/read/search[phase/query]]
Caused by: org.elasticsearch.search.SearchParseException: [logstash-2015.07.11][3]: from[-1],size[-1]: Parse Failure [Failed to parse source [{
  "query" : {
      "_id" : "AU56fE-yr7opIbuImhJ1"
 }
}]]
        at org.elasticsearch.search.SearchService.parseSource(SearchService.java:735)
        at org.elasticsearch.search.SearchService.createContext(SearchService.java:560)
        at org.elasticsearch.search.SearchService.createAndPutContext(SearchService.java:532)
        at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:294)
        at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:776)
        at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:767)
        at org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:279)
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:36)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.elasticsearch.index.query.QueryParsingException: [logstash-2015.07.11] [_na] query malformed, no field after start_object
        at org.elasticsearch.index.query.QueryParseContext.parseInnerQuery(QueryParseContext.java:295)
        at org.elasticsearch.index.query.IndexQueryParserService.innerParse(IndexQueryParserService.java:382)
        at org.elasticsearch.index.query.IndexQueryParserService.parse(IndexQueryParserService.java:281)
        at org.elasticsearch.index.query.IndexQueryParserService.parse(IndexQueryParserService.java:276)
        at org.elasticsearch.search.query.QueryParseElement.parse(QueryParseElement.java:33)
        at org.elasticsearch.search.SearchService.parseSource(SearchService.java:719)
... 10 more

In taking a look through the log, I tailed back around 500 lines and examined the output. All of the errors starting with " Failed to execute [org.elasticsearch.action.search.SearchRequest@2c539ad4]" appear to have identical output, except for the fact that they are for different logstash indexes varying by date. They all appear to relate to java/groovy errors.

If I do an elasticsearch query based on the "_id" field I see in the error from that log entry I see that I do receive an error that's probably the same error that Logstash is getting when it tries to query Elasticsearch:

 #curl -XGET "http://localhost:9200/logstash-2015.07.11/_search?pretty=true" -d '{
>   "query" : {
>       "_id" : "AU56fE-yr7opIbuImhJ1"
>  }
> }'
{
  "error" : "SearchPhaseExecutionException[Failed to execute phase [query], all shards failed; shardFailures {[TxwvI4HLSPqC_E8AEBwzEQ][logstash-2015.07.11][0]: RemoteTransportException[[JF_ES3][inet[/66.147.235.108:9300]][indices:data/read/search[phase/query]]]; nested: SearchParseException[[logstash-2015.07.11][0]: from[-1],size[-1]: Parse Failure [Failed to parse source [{\n  \"query\" : {\n      \"_id\" : \"AU56fE-yr7opIbuImhJ1\"\n }\n}]]]; nested: QueryParsingException[[logstash-2015.07.11] [_na] query malformed, no field after start_object]; }{[pIegEsgDSAGF4Vgz4JZ42g][logstash-2015.07.11][1]: RemoteTransportException[[JF_ES2][inet[/10.10.10.26:9300]][indices:data/read/search[phase/query]]]; nested: SearchParseException[[logstash-2015.07.11][1]: from[-1],size[-1]: Parse Failure [Failed to parse source [{\n  \"query\" : {\n      \"_id\" : \"AU56fE-yr7opIbuImhJ1\"\n }\n}]]]; nested: QueryParsingException[[logstash-2015.07.11] [_na] query malformed, no field after start_object]; }{[8g0BOveySTyrp27oheNqsQ][logstash-2015.07.11][2]: SearchParseException[[logstash-2015.07.11][2]: from[-1],size[-1]: Parse Failure [Failed to parse source [{\n  \"query\" : {\n      \"_id\" : \"AU56fE-yr7opIbuImhJ1\"\n }\n}]]]; nested: QueryParsingException[[logstash-2015.07.11] [_na] query malformed, no field after start_object]; }{[pIegEsgDSAGF4Vgz4JZ42g][logstash-2015.07.11][3]: RemoteTransportException[[JF_ES2][inet[/10.10.10.26:9300]][indices:data/read/search[phase/query]]]; nested: SearchParseException[[logstash-2015.07.11][3]: from[-1],size[-1]: Parse Failure [Failed to parse source [{\n  \"query\" : {\n      \"_id\" : \"AU56fE-yr7opIbuImhJ1\"\n }\n}]]]; nested: QueryParsingException[[logstash-2015.07.11] [_na] query malformed, no field after start_object]; }{[8g0BOveySTyrp27oheNqsQ][logstash-2015.07.11][4]: SearchParseException[[logstash-2015.07.11][4]: from[-1],size[-1]: Parse Failure [Failed to parse source [{\n  \"query\" : {\n      \"_id\" : \"AU56fE-yr7opIbuImhJ1\"\n }\n}]]]; nested: QueryParsingException[[logstash-2015.07.11] [_na] query malformed, no field after start_object]; }]",
  "status" : 400
}

I'm also seeing these errors less frequently that are probably related to the above:

[2015-07-11 20:43:54,300][DEBUG][action.search.type       ] [JF_ES1] All shards failed for phase: [query]
org.elasticsearch.transport.RemoteTransportException: [JF_ES2][inet[/10.10.10.26:9300]][indices:data/read/search[phase/query]]
Caused by: org.elasticsearch.search.SearchParseException: [logstash-2015.07.11][3]: from[-1],size[-1]: Parse Failure [Failed to parse source [{
  "query" : {
   "_id" : "AU56fE-yr7opIbuImhJ1"
  }
}]]
        at org.elasticsearch.search.SearchService.parseSource(SearchService.java:735)
        at org.elasticsearch.search.SearchService.createContext(SearchService.java:560)
        at org.elasticsearch.search.SearchService.createAndPutContext(SearchService.java:532)
        at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:294)
        at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:776)
        at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:767)
        at org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:279)
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:36)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.elasticsearch.index.query.QueryParsingException: [logstash-2015.07.11] [_na] query malformed, no field after start_object
        at org.elasticsearch.index.query.QueryParseContext.parseInnerQuery(QueryParseContext.java:295)
        at org.elasticsearch.index.query.IndexQueryParserService.innerParse(IndexQueryParserService.java:382)
        at org.elasticsearch.index.query.IndexQueryParserService.parse(IndexQueryParserService.java:281)
        at org.elasticsearch.index.query.IndexQueryParserService.parse(IndexQueryParserService.java:276)
        at org.elasticsearch.search.query.QueryParseElement.parse(QueryParseElement.java:33)
        at org.elasticsearch.search.SearchService.parseSource(SearchService.java:719)
        ... 10 more

And that error shares the same "_id" field as the previous error:

"_id" : "AU56fE-yr7opIbuImhJ1"

So I'm thinking that these may be related to java /groovy. And I was wondering if someone had a suggestion on how to correct this problem.

This is all I have to my elasticsearch yaml... deleting the commented out lines and empty lines from the output:

#grep -v '#' /etc/elasticsearch/elasticsearch.yml |sed '/^\s*$/d'
node.name: "JF_ES1"
node.master: true
node.data: true
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["localhost"]

My other two nodes in the cluster have this:

 #grep -v '#' /etc/elasticsearch/elasticsearch.yml |sed '/^\s*$/d'
node.name: "JF_ES2"
node.data: true
node.master: false
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["10.10.10.25"]

#grep -v '#' /etc/elasticsearch/elasticsearch.yml |sed '/^\s*$/d'
node.name: "JF_ES3"
node.master: false
node.data: true
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["10.10.10.25"]

It's not much. Pretty straight forward and basic from a configuration standpoint. I'm wondering if these errors that I'm getting are indeed Java/groovy related errors.

I'd appreciate some help in getting past this problem!

Thanks

warkolm · July 12, 2015, 8:42am

You probably don't want to search on _id, use a direct get instead.

However that shouldn't stop Logstash at all so I doubt they are related. What do your LS logs show?

bluethundr · July 12, 2015, 7:09pm

[quote="warkolm, post:2, topic:25373, full:true"]
You probably don't want to search on _id, use a direct get instead.

However that shouldn't stop Logstash at all so I doubt they are related. What do your LS logs show?
[/quote

Hmm.. ok. Well on the Logstash side of things I see this message repeated:

{:timestamp=>"2015-07-10T20:39:54.411000-0400", :message=>"retrying failed action with response code: 429", :level=>:warn}

This happens 1830 times in the logs

#grep "retrying failed action with response code: 429" /var/log/logstash/logstash.log | wc -l
1830

This is the first timestamp of when those messages started occurring:

2015-07-01T22:57:23.865000-0400

The only other errors I've seen are these:

{:timestamp=>"2015-07-01T22:52:58.248000-0400", :message=>"Exception in lumberjack input", :exception=>#<LogStash::ShutdownSignal: LogStash::ShutdownSignal>, :level=>:error}

Also I notice if I shut everything down and start everything up again, Logstash will show result for the next few hours probably. But then it'll just stop showing results yet again.

It's a little annoying. But what do those errors that I see in ES mean? And how can I get everything to run more reliably?

warkolm · July 12, 2015, 9:20pm

429 is usually a queue full response, so it sounds like you are overloading your node.
Can you add more resources to it?

bluethundr · July 12, 2015, 10:44pm

Hmm ok, yeah that's probably the case!! I guess I can add some resources to the setup. But it's probably going to take some time.

The logstash host is also the first ES node. It has 6GB of Ram and 45GB of HD space. The other 2 ES nodes only have 2GB ram and 25GB HD space. I have logstash forwarder running on 25 linux nodes of varying resources.

Is there any good documentation on how to properly allocate resources to an ELK setup? Got any recommendations?

warkolm · July 12, 2015, 10:47pm

How much heap have you assigned to ES?
How much data is in your cluster?
How are you monitoring things?

bluethundr · July 12, 2015, 11:08pm

How would I determine that?

There's roughly speaking about 85GB of log data among the 25 hosts. Generally about 3GB of logs per host, with one host topping out at 8GB.

With a combination of nagios and munin.

warkolm · July 12, 2015, 11:10pm

That's not much data on a large number of hosts.

Are you using default settings?

bluethundr · July 12, 2015, 11:11pm

Yep... just went with the defaults.

warkolm · July 12, 2015, 11:13pm

Right, cause that won't give ES much heap, 1GB max.

Check out https://www.elastic.co/guide/en/elasticsearch/guide/current/deploy.html

bluethundr · July 12, 2015, 11:14pm

Got it! I'll check that out. Thanks!

Topic		Replies	Views
Failed to execute phase [query_fetch], total failure Elasticsearch	7	1059	July 6, 2017
Elasticsearch throws exceptions while using with logstash and kibana Elasticsearch	3	820	July 6, 2017
SearchPhaseExecutionException on wild loop Elasticsearch	16	411	July 6, 2017
Communication Problem between logstash and elasticsearch Elasticsearch	3	1318	July 6, 2017
All shards failed for phase Elasticsearch elastic-stack-monitoring	2	3762	July 6, 2017

Failed to execute org.elasticsearch.action.search.SearchRequest error

Related topics