Logstash stops after a few hours


#1

All,

I have a few devices sending logs to ELK. Each sends syslog to a specific port, and events are indexed based on the port they arrive on. So far it is all networking gear such as Juniper and Cisco. Everything seems to work great, but after a few hours something stops: Kibana shows no more events. If I restart Logstash, everything starts working again. There is nothing in /var/log/logstash/logstash.log or .err.

Any ideas where to even start looking? CPU peaks at about 5%, memory is nowhere near exhausted, and only about 1% of the drive is used.

Thanks


(Mark Walkom) #2

What version? What JVM?
What does your config look like?


#3

java version "1.8.0_45"

Java(TM) SE Runtime Environment (build 1.8.0_45-b14)

Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode

logstash-core-1.5.0.rc4

Basic config: three files handle inputs and filtering based on port number, and one file handles the output. Here is a sample of an input file and the output file:

input {
  udp {
    port => 5143
    type => vtc
  }
}

filter {

  ########## Auth Failures ##########

  if [message] =~ "Authentication failure" {
    mutate {
      add_field => [ "vtc-type", "Auth_Failure" ]
      add_field => [ "log-type", "vtc" ]
    }
  }
  else if [message] =~ "User cannot be authenticated" {
    mutate {
      add_field => [ "vtc-type", "Auth_Failure" ]
      add_field => [ "log-type", "vtc" ]
    }
  }
  else if [message] =~ "Login attempt" and [message] =~ "FAILURE " {
    mutate {
      add_field => [ "vtc-type", "Auth_Failure" ]
      add_field => [ "log-type", "vtc" ]
    }
  }
  else if [message] =~ "Unauthenticated user" and [message] =~ "FAILURE " {
    mutate {
      add_field => [ "vtc-type", "Auth_Failure" ]
      add_field => [ "log-type", "vtc" ]
    }
  }

  ########## Auth Success ##########

  else if [message] =~ "Starting session: shell" {
    mutate {
      add_field => [ "vtc-type", "Login_Success" ]
      add_field => [ "log-type", "vtc" ]
    }
  }
  else if [message] =~ "Recorded successful login" {
    mutate {
      add_field => [ "vtc-type", "Login_Success" ]
      add_field => [ "log-type", "vtc" ]
    }
  }
  else if [message] =~ "user ID" and [message] =~ "SUCCESS" {
    mutate {
      add_field => [ "vtc-type", "Auth_Success" ]
      add_field => [ "log-type", "vtc" ]
    }
  }
}

Output file:

output {
  stdout { codec => rubydebug }

  if [log-type] == "vtc" {
    elasticsearch {
      index => "logstash_vtc-%{+YYYY.MM.dd}"
      host => "localhost"
    }
  }
  else if [log-type] == "cisco" {
    elasticsearch {
      index => "logstash_cisco-%{+YYYY.MM.dd}"
      host => "localhost"
    }
  }
  else if [log-type] == "juniper" {
    elasticsearch {
      index => "logstash_juniper-%{+YYYY.MM.dd}"
      host => "localhost"
    }
  }
  else {
    elasticsearch {
      index => "logstash_unknown-%{+YYYY.MM.dd}"
      host => "localhost"
    }
  }
}


(Mark Walkom) #4

There's since been a GA release of 1.5; I'd suggest you upgrade to that first.


#5

Thanks. Because I am fairly new to this, is there a "best way" to upgrade?

Also, not sure if it is related, but I am getting these from a Cisco device:

"Received an event that has a different character encoding than you configured"


(Mark Walkom) #6

How did you install, deb/rpm or zip?


#7

Because I am halfway lazy and wanted to make this repeatable, I created a script to install ELK. The Logstash part is here:

echo 'deb http://packages.elasticsearch.org/logstash/1.5/debian stable main' | sudo tee /etc/apt/sources.list.d/logstash.list

sudo apt-get update
sudo apt-get install -y logstash


(Mark Walkom) #8

sudo apt-get install logstash should do it.


#9

OK, so I did the install and had no issues getting it upgraded, but it still has the same problem. I had one device sending a lot of "Received an event that has a different character encoding than you configured" errors and fixed that. It seemed more stable, but it crashed again last night. The last entry in logstash.log is the same as above, but with no host listed as the source. Other than that, no other errors.


#10

I also replaced the file below, as it seems related, but still no go. It now only stays up for a few minutes and then hangs:

https://raw.githubusercontent.com/driskell/logstash/fe1cb7c91f13fda1661b9471dbcc4bd390b4d487/lib/logstash/pipeline.rb


#11

OK, still having the same issue; here is what was in the logs. What is odd is that it only seems to throw this error under certain circumstances, after some time:

{:timestamp=>"2015-06-11T07:57:02.735000-0800", :message=>"Error: Expected one of #, input, filter, output at line 324, column 1 (byte 4415) after "}
{:timestamp=>"2015-06-11T07:57:02.742000-0800", :message=>"You may be interested in the '--configtest' flag which you can\nuse to validate logstash's configuration before you choose\nto restart a running system."}


(Magnus Bäck) #12

Is something restarting your Logstash daemon? I don't think the configuration reader code path is reached except during initialization.


(Suyog Rao) #13

Are there any non-Logstash config files in your /etc/logstash/conf.d directory? If so, that's the problem.
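Worth checking explicitly: Logstash concatenates every file in that directory, so a leftover backup or editor swap file would produce exactly this kind of parse error, at a line number past the end of your real configs. A quick check, assuming the default deb install path:

```shell
# Print anything in conf.d that would be loaded but is not a .conf file
find /etc/logstash/conf.d -maxdepth 1 -type f ! -name '*.conf' 2>/dev/null
```

If this prints anything, move those files out of the directory and restart Logstash.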


#14

All,

Thanks for the feedback. I checked the /etc/logstash/conf.d directory and there are only a few files there, all of them .conf files. It crashed again, but the log files show nothing. I also checked all the cron jobs to make sure nothing was restarting it. Here are the last lines in all three logs:

.err=
Jun 19, 2015 9:05:51 AM org.elasticsearch.cluster.service.InternalClusterService$UpdateTask run
INFO: [logstash-MbsVmLogAgg-1-1442407170-7960] added {[logstash-MbsVmLogAgg-1-1442407170-7968][OVMa1pHhRZWpqsJIkv_WQA][MbsVmLogAgg-1][inet[/10.0.2.40:9310]]{data=false, client=true},}, reason: zen-disco-receive(from master [[Raa of the Caves][2F51UGUFRZedylo470v86Q][MbsVmLogAgg-1][inet[/127.0.0.1:9300]]])

.log=
{:timestamp=>"2015-06-16T08:41:34.806000-0800", :message=>"SIGTERM received. Shutting down the pipeline.", :level=>:warn}

.stdout=
"message" => "<182>1 2015-06-20T15:20:07.652-08:00 xxx.xxx.xxx RPRM 5248 JserverAudit - INFO |http-443-37|ResourceManager_Audit_Log| USER_CHANGE (NOTICE): SUCCESS [+ TR74bd5] ~ Recorded successful login statistics for 1 [user ID: LOCAL\admin, source: xxx.xxx.xxx.xxx, destination: xxx.xxx.xxx.xxx]\n",
"@version" => "1",
"@timestamp" => "2015-06-20T23:20:09.132Z",
"type" => "vtc",
"host" => "xxx.xxx.xxx.xxx",
"vtc-type" => "Login_Success",
"log-type" => "vtc"


#15

Also, if I try to shut down the service it hangs, and I have to kill the process before I can restart it. This only happens when it "crashes".


#16

OK, so I turned on verbose logging via the -vv option. When it crashed, here was the last line in the .log file:

<14>2015 Jun 24 22: -admin-shell: %WAAS-PARSER-6-350232: CLI_LOG log_cli_command: show flash ", "@version"=>"1", "@timestamp"=>"2015-06-24T22:04:39.372Z", "type"=>"cisco", "host"=>"xxx.xxx.xxx.xxx"}, "log-type"]}>>]]}, :batch_timeout=>1, :force=>true, :final=>nil, :level=>:debug, :file=>"stud/buffer.rb", :line=>"207", :method=>"buffer_flush"}


(Asaf Yigal) #17

We had the same issues with Logstash, and if you look it up you'll see that you are not alone in facing them.

I can tell you that we ended up not using Logstash due to continuous stability issues that were taking too much of our time, but here are a few pointers if you still want to use it:

  1. Take a look at memory consumption - this is a big thing in Logstash. Even on a strong machine it sometimes dies from OutOfMemory, and in that case nothing is written to the log files - the process just hangs.
  2. Take a look at the max message size and how you are sending the logs to Logstash - if events are longer than the buffer (which cannot be extended beyond a certain value), Logstash issues a socket disconnect, and most shippers just give up and do not try to reconnect until you restart Logstash.
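Both points are tunable on this setup. For point 1, the deb package reads the JVM heap size from LS_HEAP_SIZE in /etc/default/logstash. For point 2, since these are udp inputs, the input plugin's buffer_size option caps the datagram size it will accept - a sketch reusing the vtc input from above (the 65536 value is an assumption; size it to your largest expected message):

```conf
input {
  udp {
    port => 5143
    type => vtc
    buffer_size => 65536  # bytes per datagram; assumed value, raise to fit your largest event
  }
}
```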

Hope that helps, and if it doesn't, I'm sure you can find a stable solution for syslog.

-- Asaf.


#18

Well, I upgraded to the latest version and so far so good for over a week. I would be interested in knowing what you moved to instead.


(system) #19