Hello! I thought I had Logstash all figured out, but apparently I don't. It worked fine up until today and indexed the log files I wanted, but today it won't index anything at all, and the log keeps showing this warning: :message=>"retrying failed action with response code: 503", :level=>:warn
I have searched around a bit and I have no idea what is going on.
Apparently it can read the events from yesterday and the days before that, but not the ones from today.
How's the health of your ES cluster? Anything interesting in the logs?
The logs are saying things like "Too many attempts at sending events. Dropping <event name>".
This is what it looks like
stash_1 | {:timestamp=>"2015-07-03T09:12:12.540000+0000", :message=>"too many attempts at sending event. dropping: 2015-07-03T09:05:53.514Z 1100e328950e 2015-07-03 09:05:53,039+0000 DEBUG [qtp2092176926-68027] *UNKNOWN org.sonatype.nexus.content.internal.ContentAuthenticationFilter - Attempting to authenticate Subject as Anonymous request...", :level=>:error}
How do you see the health of your cluster? After some research, my best guess is that Logstash has stashed too many events or something. Is there a way to fix this?
The cluster health API is useful, and you should definitely run an ES dashboard plugin like kopf. Sorry for being unclear about the logs; it's the ES logs I'm talking about. ES is the software that's misbehaving so it's likely that its logs will be the most interesting.
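For reference, the cluster health API can be queried over plain HTTP. Here is a minimal sketch in Python, assuming ES is reachable on localhost:9200 (the default HTTP port, which may differ in your setup). The function names are made up for illustration.

```python
import json
import urllib.request


def health_url(host="localhost", port=9200):
    """Build the cluster health URL (host and port are assumptions)."""
    return "http://{}:{}/_cluster/health".format(host, port)


def fetch_cluster_health(host="localhost", port=9200, timeout=5):
    """Return the cluster health document as a dict.

    Raises an exception from urllib if ES cannot be reached.
    """
    with urllib.request.urlopen(health_url(host, port), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling fetch_cluster_health()["status"] should give you "green", "yellow" or "red".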
Where do I find the ES logs? My best guess is that the index is full, because the log message says that it attempts to send events but fails. There are a lot of words I don't really know, like "index", "indices" and such.
The ES logs are normally found in /var/log/elasticsearch. There's no such thing as a full index, but you may very well have run out of RAM.
There are a lot of words I don't really know, like "index", "indices" and such.
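If you are not sure which file to open, a small sketch like the following lists the log files in that directory, newest first. It assumes the default location mentioned above and that the files end in .log (by default the main log is usually named after the cluster). find_es_logs is a hypothetical helper name.

```python
import glob
import os


def find_es_logs(log_dir="/var/log/elasticsearch"):
    """Return the *.log files in log_dir, newest first.

    Returns an empty list if the directory is missing or holds no logs.
    """
    paths = glob.glob(os.path.join(log_dir, "*.log"))
    return sorted(paths, key=os.path.getmtime, reverse=True)
```

An empty result means ES is either logging somewhere else or not managing to write its logs at all.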
I suggest you read Elasticsearch: The Definitive Guide (at least the intro chapter(s)). The Elasticsearch Reference document also contains explanations of indexes, shards, and so on. Without a basic understanding of those concepts this is going to be a bumpy ride for you.
The log is saying
{:timestamp=>"2015-07-03T11:42:00.292000+0000", :message=>"retrying failed action with response code: 503", :level=>:warn}
and
{:timestamp=>"2015-07-03T11:45:04.858000+0000", :message=>"too many attempts at sending event. dropping: 2015-07-03T11:35:58.600Z 7738b91a24ce Using data from old metadata for com/jeppesen/jcms/ioserver.sdk/19.2.0/ioserver.sdk-19.2.0.rpm", :level=>:error}
That's the Logstash logfile. What's in the Elasticsearch logfile?
Because of the error message I get it feels like it has stashed away the events from all the logs that I've indexed. Is there some way to remove the events that I've stashed?
Nothing is being "stashed". If Logstash can't send data to its output it halts the pipeline.
Okay! From what the log says, what do you think might be the problem with the pipeline? Have you ever encountered something like this before? I've read up a little bit on shards, nodes and clusters.
I managed to get the status on my cluster and yes it's red.
"cluster_name" : "elk",
"status" : "red",
"timed_out" : false,
"number_of_nodes" : 1,
"number_of_data_nodes" : 1,
"active_primary_shards" : 51,
"active_shards" : 51,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 61,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0
Again, the Elasticsearch logs should contain clues as to what's up with the cluster. There's no point in guessing what it could be.
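The health output does at least tell us how much of the cluster is affected. Red means at least one primary shard is not allocated. Yellow would mean all primaries are allocated but some replicas are not. On a single-node cluster replicas can never be allocated, so some unassigned shards are expected there. A rough sketch that summarizes the posted numbers follows. The braces were added to make the snippet valid JSON, and summarize is a made-up helper.

```python
import json

# Health fields as posted above, wrapped in braces to form valid JSON.
HEALTH_JSON = """
{
  "cluster_name" : "elk",
  "status" : "red",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 51,
  "active_shards" : 51,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 61,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0
}
"""


def summarize(health):
    """Condense a cluster health dict into a few headline numbers."""
    active = health["active_shards"]
    total = active + health["initializing_shards"] + health["unassigned_shards"]
    return {
        "status": health["status"],
        "total_shards": total,
        # Share of all shards that are currently active.
        "active_percent": round(100.0 * active / total, 1) if total else 100.0,
        "single_node": health["number_of_nodes"] == 1,
    }


summary = summarize(json.loads(HEALTH_JSON))
print(summary)
```

Fewer than half of the 112 shards are active here. The numbers do not say why the rest are unassigned, which is why the ES log is still needed.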
Remind me, where could I find the elasticsearch logfile?
See earlier in the thread.
Apparently the elasticsearch folder is empty. What is the log called so I can search for it?
/var/log/elasticsearch is empty? Is your disk full?
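A quick way to check is to look at the free space on the filesystem that holds the log and data directories. Here is a minimal sketch using Python's standard library. disk_report is a hypothetical helper, and the default path "/" is an assumption, so pass whichever mount point actually holds /var/log/elasticsearch and the ES data.

```python
import shutil


def disk_report(path="/"):
    """Return total/free space in GB and the used percentage for path."""
    usage = shutil.disk_usage(path)
    used_percent = 100.0 * usage.used / usage.total if usage.total else 0.0
    return {
        "total_gb": round(usage.total / 1e9, 1),
        "free_gb": round(usage.free / 1e9, 1),
        "used_percent": round(used_percent, 1),
    }


print(disk_report("/"))
```

A used_percent close to 100 would explain both the empty log directory and shards that cannot be allocated.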