Failed Action

Last night I upgraded my Elasticsearch data nodes from 4 CPUs to 8 CPUs and now I am getting these errors from my Logstash shippers. Please help!!!!!

[2017-02-08T15:31:14,344][WARN ][logstash.outputs.elasticsearch] Failed action.
{:status=>500, :action=>["index", {:_id=>nil, :_index=>"logstash-2017.02.08", :_type=>"logs", :_routing=>nil},
####<LOG MESSAGE REMOVED>####,
:response=>{"index"=>{"_index"=>"logstash-2017.02.08",
  "_type"=>"logs",
  "_id"=>"AVofozXYJ8M9dMOnBjdZ",
  "status"=>500,
  "error"=>{"type"=>"illegal_state_exception",
  "reason"=>"Message not fully read (request) for requestId [151918],
  action [indices:data/write/bulk[s]],
  available [30016]; resetting"}}}}

Sooooooo,
I deleted my indexes, then I reinstalled Elasticsearch.

STILL getting this error.

Please help.

So,
I gave up. I formatted and started over. I rebuilt my entire cluster (6 data nodes, 3 master nodes, and 1 Kibana node).

I am STILL getting these errors.

This has gotten too ridiculous for words. I upgraded my data nodes last night from 4 CPUs to 8 CPUs and everything has gone downhill from there.

I am still getting these errors. Does anyone have any ideas?

Anyone? Bueller?

Hey,

Instead of just bumping this, you could actively provide some more information. I am sure that increasing the number of CPUs is not the core of your problem.

  • Which Logstash version are you using?
  • Which JVM is Logstash running on?
  • How did you install Logstash? (RPM, DEB, zip, etc.)
  • Which Elasticsearch version are you using?
  • Which JVM is Elasticsearch running on?
  • How did you install Elasticsearch? (RPM, DEB, zip, etc.)
  • Did you check the Elasticsearch log files?
  • What does your Logstash config look like?
  • What does your Elasticsearch config look like? How does it differ from a standard configuration?
  • Can you reproduce this with a regular HTTP request? (See the sketch below.)
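
For that last point, something along these lines would do - just a sketch in Python, with the host kept as the x.x.x.x placeholder from your config and the index/type taken from the error above; adapt as needed:

# Sketch only: index a single document over plain HTTP against the cluster,
# bypassing Logstash entirely. The host is a placeholder - point it at one
# of your data nodes.
import json
import urllib.request

host = "http://x.x.x.x:9200"              # placeholder, as in your Logstash config
url = host + "/logstash-2017.02.08/logs"  # daily index and "logs" type from the error above

doc = json.dumps({"message": "manual test event"}).encode("utf-8")
req = urllib.request.Request(url, data=doc, headers={"Content-Type": "application/json"})
with urllib.request.urlopen(req) as resp:
    print(resp.getcode(), resp.read().decode())

If a plain request like that indexes fine, the problem is more likely between the nodes than between Logstash and Elasticsearch.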

--Alex

I have several other posts up related to errors like this one, so sorry if I did not include everything here.

All Logstash versions are 5.2.1 and the ES cluster servers are 5.2.1 as well.

master name         version role disk.avail heap.max ram.max ram.current cpu uptime jdk
-      DATANODE-01  5.2.1   d         1.3tb   29.9gb  39.2gb        39gb  25   3.9d 1.8.0_121
-      DATANODE-02  5.2.1   d         1.3tb   29.9gb  39.2gb        38gb  14   3.9d 1.8.0_121
-      DATANODE-03  5.2.1   d         1.4tb   29.9gb  39.2gb      38.8gb  32   3.9d 1.8.0_121
-      DATANODE-04  5.2.1   d         1.3tb   29.9gb  39.2gb      38.8gb  28   3.9d 1.8.0_121
-      DATANODE-06  5.2.1   d         1.4tb   29.9gb  39.2gb      38.2gb   1   3.9d 1.8.0_121
-      KBNANODE-01  5.2.1   -        39.1gb    7.9gb  15.6gb        10gb   0   3.9d 1.8.0_121
-      MSTRNODE-01  5.2.1   m        39.9gb    7.9gb  15.6gb       9.2gb   0   3.9d 1.8.0_121
-      MSTRNODE-02  5.2.1   m        39.9gb    7.9gb  15.6gb       9.2gb   0   3.9d 1.8.0_121
*      MSTRNODE-03  5.2.1   m        39.8gb    7.9gb  15.6gb       9.2gb   0   3.9d 1.8.0_121
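
A listing like the one above can be pulled with the _cat/nodes API - roughly as in this sketch, with the host as a placeholder:

# Sketch: fetch a node overview like the table above via _cat/nodes.
# Any node that serves HTTP on 9200 works; the host is a placeholder.
import urllib.request

host = "http://x.x.x.x:9200"
cols = "master,name,version,role,disk.avail,heap.max,ram.max,ram.current,cpu,uptime,jdk"

with urllib.request.urlopen(host + "/_cat/nodes?v&h=" + cols) as resp:
    print(resp.read().decode())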

I installed both Logstash on the shippers and Elasticsearch on the cluster nodes via apt-get.

Not sure which file you are referring to for the Logstash config. (logstash.yml?)

Same for Elasticsearch (elasticsearch.yml?).

I do not know how to reproduce the error.

So, have you checked all the log files of your nodes for the above error messages? It has to come from somewhere and is likely to be logged. Is there a stack trace? That's one of the more important bits to get.

Is there any special configuration in the elasticsearch.yml file?

Is there any special configuration in the logstash configuration file? Can you show the elasticsearch output configuration section?

I have not seen any of these errors in the logs on any of the ES cluster servers.

Nothing unique. Basic options configured.

Sure, here is my Logstash output:

##########
# ELASTICSEARCH Output Parameters
##########
output {
  elasticsearch {
    hosts => ["http://x.x.x.x:9200","http://x.x.x.x:9200","http://x.x.x.x:9200","http://x.x.x.x:9200","http://x.x.x.x:9200"]
    index => "logstash-%{+YYYY.MM.dd}"
  }
}
##################################################

This is the same output configuration for all 30 Logstash shippers.

Hey,

Given the error message, I am still pretty sure there must be an exception somewhere in the logs - at least I hope so, to move forward. It will be either on the node that received the bulk request or on one of the data nodes that parts of the bulk request were forwarded to.

Can you search the logs on all of your nodes for Message not fully read? I am pretty surprised by that message, given that you are using the same version on all of your nodes, so I would like to get my hands on a stack trace. The message indicates that, at the transport protocol level, an unexpected message length was sent from one node to another. At the HTTP level, from Logstash to Elasticsearch, everything was fine.
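
Something along these lines, run on each node, would find it - just a sketch; /var/log/elasticsearch is only the default log directory for a DEB/apt install, so adjust the path if yours differs:

# Rough sketch: scan the Elasticsearch logs on a node for the
# "Message not fully read" message and print it with some following
# lines, so that a stack trace (if any) comes along with it.
import glob

NEEDLE = "Message not fully read"
CONTEXT = 40  # lines to print after each hit, enough for a typical stack trace

for path in glob.glob("/var/log/elasticsearch/*.log"):
    with open(path, errors="replace") as f:
        lines = f.readlines()
    for i, line in enumerate(lines):
        if NEEDLE in line:
            print("==== {}:{} ====".format(path, i + 1))
            print("".join(lines[i:i + CONTEXT]))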

--Alex

I will do some digging to see what I can find.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.