Logstash not ingesting syslog


(Sfchrisgleason) #1

Good evening. This is my first post here, so forgive me if I'm missing some rules here. I'll try to be as concise as possible:

Issue:

I am not able to get logstash to ingest Syslog files from servers using the syslog sender.
I currently have some servers sending in using filebeat, and can see those event stored properly.

What I've tried:

  • I tried different ports, TCP and UDP in the logstash config
  • I've verified nothing is running on 514 TCP/UCP if logstash is stopped
  • I've tried Syslog input and TCP/UDP inputs.
  • I've set root as the user to launch logstash in default/logstash
  • I've used setcap to allow java to fire off privliedges ports (in the case of UDP I just can't get it to listen)
  • I've run captures to see if the server is listening and I'm seeing all the traffic inbound on 514 both TCP and UDP from different hosts coming in.
  • I've set LS_JAVA_OPTS="-Djava.net.preferIPv4Stack=true" in /default/logstash
  • I've tried adding and removing the elasticsearch output for all my conf.d files.

Probably more as it's been a few days now, but my elbows hurt so I'm trying to wrap this up...

My versions:

Logstash: 2.3
Java: java-1.8.0-openjdk-amd64
OS: Debian Jessie (8.6)

What I'm hoping to get out of this:

I am trying to compare ELK to Splunk and am finding ELK worked fine on a different build, but slowly disintegrated. I was using this tutorial to do the initial build. It worked fine, then when I tried to build on another machine it never worked properly again.

While Splunk is expensive, I've found it considerably easier to setup. I'm hoping this is just something simple I'm missing. I'm relatively well versed in Linux, but by no means an expert, so any help would be appreciated.

I have a growing feeling it's somehow to do with the elasticsearch output...

Some errors I encountered along the way

When setting up input UDP I am getting this error and have yet to get the system listening on 514/UDP:

{:timestamp=>"2016-12-09T16:22:45.607000-0800", :message=>"UDP listener died", :exception=>#<IOError: closed stream>, :backtrace=>["org/jruby/RubyIO.java:3682:in select'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-input-udp-2.0.5/lib/logstash/inputs/udp.rb:77:inudp_listener'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-input-udp-2.0.5/lib/logstash/inputs/udp.rb:50:in run'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.2.4-java/lib/logstash/pipeline.rb:334:ininputworker'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.2.4-java/lib/logstash/pipeline.rb:328:in `start_input'"], :level=>:warn}

My config files:

02-beats-input.conf

input {
  beats {
    port => 5044
  }
}

10-syslog-filter.conf

input {
  syslog {
    port => 514
  }
}

filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
    syslog_pri { }
    date {
      match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
}

output {
  elasticsearch {
    hosts => "localhost:9200"
    manage_template => false
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
}

30-elasticsearch-output.conf

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    sniffing => true
    manage_template => false
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
}

That's about it. Thanks!

Chris


(Mark Walkom) #2

Unless Logstash is running as root it won't listen on that port as it's considered privileged, try a higher one (above 1024).


(Sfchrisgleason) #3

Thanks Mark.

I did forget to mention that I set the LS_USER to root in default/logstash. It is running as root.

I also tried several higher level UDP ports and got the same message:

{:timestamp=>"2016-12-09T16:22:45.607000-0800", :message=>"UDP listener died", :exception=>#, :backtrace=>["org/jruby/RubyIO.java:3682:in select'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-input-udp-2.0.5/lib/logstash/inputs/udp.rb:77:inudp_listener'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-input-udp-2.0.5/lib/logstash/inputs/udp.rb:50:in run'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.2.4-java/lib/logstash/pipeline.rb:334:ininputworker'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.2.4-java/lib/logstash/pipeline.rb:328:in `start_input'"], :level=>:warn}


(Sfchrisgleason) #4

Also just remembered, I have a TCP server listening on 514 and sent syslog from an ESX box to it on tcp://serverIP:514. So I'm not seeing them using TCP either, but the port is listening and the socket is established.

I can see the packets coming in on the packet capture, but am not seeing them at all in elasticsearch/kibana...


(Sfchrisgleason) #5

Also just some more random information to help confuse things :grin:

I only get the message above if I explicitly define a UDP input (doesn't matter what port I tried 514, 1514, 10514 and more).

However if I choose syslog output, then I don't get the message. I see the same output from lsof and netstat either way though, that there is a TCP listener and a failed UDP one:

sudo lsof -i :514
COMMAND   PID USER   FD   TYPE  DEVICE SIZE/OFF NODE NAME
java    24658 root   49u  IPv4 4245798      0t0  TCP *:shell (LISTEN)
java    24658 root   50u  IPv4 4242432      0t0  UDP *:syslog
java    24658 root   73u  IPv4 4244946      0t0  TCP somehostname:shell->X.X.X.X:33531 (ESTABLISHED)

and

sudo netstat -nl | grep 514
tcp        0      0 0.0.0.0:514             0.0.0.0:*               LISTEN
udp        0      0 0.0.0.0:514             0.0.0.0:*

The output is the same for either output I use, but the error only pops up in the logstash logs if I specify a udp output.

Here is the output when using a syslog listener with 1514 as the port:

sudo netstat -nl | grep 1514
tcp        0      0 0.0.0.0:1514            0.0.0.0:*               LISTEN
udp        0      0 0.0.0.0:1514            0.0.0.0:*


sudo lsof -i :1514
COMMAND   PID USER   FD   TYPE  DEVICE SIZE/OFF NODE NAME
java    25667 root   27u  IPv4 4266066      0t0  TCP *:1514 (LISTEN)
java    25667 root   50u  IPv4 4266067      0t0  UDP *:1514

Turning on debug logging spits out a whole ton of stuff, but I've no idea how to digest it or know what to look for other than than standard 'error' 'fail' etc. Let me know if there is something specific that would help debug this...


(Sfchrisgleason) #6

Here is some sanitized information from the debug level of the log from a syslog event that was sent in via udp:

{:timestamp=>"2016-12-12T16:38:22.690000-0800", :message=>"Date filter looking for field", :type=>"syslog", :field=>"syslog_timestamp", :level=>:debug, :file=>"logstash/filters/date.rb", :line=>"314", :method=>"filter"}
{:timestamp=>"2016-12-12T16:38:22.690000-0800", :message=>"filter received", :event=>{"message"=>"<37>Dec 13 00:38:20 netlogger:              To: <sip:FQDN:5060;transport=udp>;tag=34899995ff532da86b6cd2acdffe0e68.f85b\n", "@version"=>"1", "@timestamp"=>"2016-12-13T00:38:22.138Z", "type"=>"syslog", "host"=>"X.X.X.X"}, :level=>:debug, :file=>"(eval)", :line=>"104", :method=>"filter_func"}
{:timestamp=>"2016-12-12T16:38:22.691000-0800", :message=>"Running grok filter", :event=>#<LogStash::Event:0x7489ad7a @metadata_accessors=#<LogStash::Util::Accessors:0x2b1ce148 @store={}, @lut={}>, @cancelled=false, @data={"message"=>"<37>Dec 13 00:38:20 netlogger:              To: <sip:FQDN:5060;transport=udp>;tag=34899995ff532da86b6cd2acdffe0e68.f85b\n", "@version"=>"1", "@timestamp"=>"2016-12-13T00:38:22.138Z", "type"=>"syslog", "host"=>"X.X.X.X"}, @metadata={}, @accessors=#<LogStash::Util::Accessors:0x7d590795 @store={"message"=>"<37>Dec 13 00:38:20 netlogger:              To: <sip:FQDN:5060;transport=udp>;tag=34899995ff532da86b6cd2acdffe0e68.f85b\n", "@version"=>"1", "@timestamp"=>"2016-12-13T00:38:22.138Z", "type"=>"syslog", "host"=>"X.X.X.X"}, @lut={"type"=>[{"message"=>"<37>Dec 13 00:38:20 netlogger:              To: <sip:FQDN:5060;transport=udp>;tag=34899995ff532da86b6cd2acdffe0e68.f85b\n", "@version"=>"1", "@timestamp"=>"2016-12-13T00:38:22.138Z", "type"=>"syslog", "host"=>"X.X.X.X"}, "type"], "host"=>[{"message"=>"<37>Dec 13 00:38:20 netlogger:              To: <sip:FQDN:5060;transport=udp>;tag=34899995ff532da86b6cd2acdffe0e68.f85b\n", "@version"=>"1", "@timestamp"=>"2016-12-13T00:38:22.138Z", "type"=>"syslog", "host"=>"X.X.X.X"}, "host"], "[type]"=>[{"message"=>"<37>Dec 13 00:38:20 netlogger:              To: <sip:FQDN:5060;transport=udp>;tag=34899995ff532da86b6cd2acdffe0e68.f85b\n", "@version"=>"1", "@timestamp"=>"2016-12-13T00:38:22.138Z", "type"=>"syslog", "host"=>"X.X.X.X"}, "type"]}>>, :level=>:debug, :file=>"logstash/filters/grok.rb", :line=>"277", :method=>"filter"}
{:timestamp=>"2016-12-12T16:38:22.692000-0800", :message=>"Event now: ", :event=>#<LogStash::Event:0x7489ad7a @metadata_accessors=#<LogStash::Util::Accessors:0x2b1ce148 @store={}, @lut={}>, @cancelled=false, @data={"message"=>"<37>Dec 13 00:38:20 netlogger:              To: <sip:FQDN:5060;transport=udp>;tag=34899995ff532da86b6cd2acdffe0e68.f85b\n", "@version"=>"1", "@timestamp"=>"2016-12-13T00:38:22.138Z", "type"=>"syslog", "host"=>"X.X.X.X", "tags"=>["_grokparsefailure"]}, @metadata={}, @accessors=#<LogStash::Util::Accessors:0x7d590795 @store={"message"=>"<37>Dec 13 00:38:20 netlogger:              To: <sip:FQDN:5060;transport=udp>;tag=34899995ff532da86b6cd2acdffe0e68.f85b\n", "@version"=>"1", "@timestamp"=>"2016-12-13T00:38:22.138Z", "type"=>"syslog", "host"=>"X.X.X.X", "tags"=>["_grokparsefailure"]}, @lut={"type"=>[{"message"=>"<37>Dec 13 00:38:20 netlogger:              To: <sip:FQDN:5060;transport=udp>;tag=34899995ff532da86b6cd2acdffe0e68.f85b\n", "@version"=>"1", "@timestamp"=>"2016-12-13T00:38:22.138Z", "type"=>"syslog", "host"=>"X.X.X.X", "tags"=>["_grokparsefailure"]}, "type"], "host"=>[{"message"=>"<37>Dec 13 00:38:20 netlogger:              To: <sip:FQDN:5060;transport=udp>;tag=34899995ff532da86b6cd2acdffe0e68.f85b\n", "@version"=>"1", "@timestamp"=>"2016-12-13T00:38:22.138Z", "type"=>"syslog", "host"=>"X.X.X.X", "tags"=>["_grokparsefailure"]}, "host"], "[type]"=>[{"message"=>"<37>Dec 13 00:38:20 netlogger:              To: <sip:FQDN:5060;transport=udp>;tag=34899995ff532da86b6cd2acdffe0e68.f85b\n", "@version"=>"1", "@timestamp"=>"2016-12-13T00:38:22.138Z", "type"=>"syslog", "host"=>"X.X.X.X", "tags"=>["_grokparsefailure"]}, "type"], "message"=>[{"message"=>"<37>Dec 13 00:38:20 netlogger:              To: <sip:FQDN:5060;transport=udp>;tag=34899995ff532da86b6cd2acdffe0e68.f85b\n", "@version"=>"1", "@timestamp"=>"2016-12-13T00:38:22.138Z", "type"=>"syslog", "host"=>"X.X.X.X", "tags"=>["_grokparsefailure"]}, "message"], "tags"=>[{"message"=>"<37>Dec 13 00:38:20 netlogger:              To: <sip:FQDN:5060;transport=udp>;tag=34899995ff532da86b6cd2acdffe0e68.f85b\n", "@version"=>"1", "@timestamp"=>"2016-12-13T00:38:22.138Z", "type"=>"syslog", "host"=>"X.X.X.X", "tags"=>["_grokparsefailure"]}, "tags"]}>>, :level=>:debug, :file=>"logstash/filters/grok.rb", :line=>"292", :method=>"filter"}

This is just a sample of one of the messages. It's clearly UDP, but it's also hitting a grokparse failure, so not sure if that's part of it...


(Sfchrisgleason) #7

So I added an if statement to the syslog output that filtered out all '_grokparsefailure' tagged messages and dumped them to a /var/log file per the article here.

It appears the ESX messages are being caught as well, which I find odd as I used the exact same grok filter on the very same ESX box before and it worked perfectly fine.

I'm not a grok expert, but I'm reading up on it now. Would this cause all messages to fail being sent to elasticsearch if the grok parsing fails?


(Sfchrisgleason) #8

LOL. I do find this message funny though (which was gathered on the ELK server via TCP port 514):

{"message":"<14>2016-12-13T06:16:23.450Z esx4.FQDN vobd: [UserLevelCorrelator] 2359134946018us: [vob.user.vmsyslogd.remote.failure] The host "ELKSERVER:514" has become unreachable. Remote logging to this host has stopped.","@version":"1","@timestamp":"2016-12-13T06:19:23.861Z","host":"esx4.FQDN","port":14226,"type":"syslog","tags":["_grokparsefailure"],"syslog_severity_code":5,"syslog_facility_code":1,"syslog_facility":"user-level","syslog_severity":"notice"}


(Sfchrisgleason) #9

Soooo. Since everyone here is SOOOOOO talkative, I went ahead and solved my problem...

JK. It's not solved, but I did see that it is in fact ingesting the logs and putting them in logstash, it's just using a metadata index...

Running:

http://elasticIP:9200/_cat/indices?v

Showed me that there were in fact extra indexes, one called:

metadata beat

yellow open   %{[@metadata][beat]}-2016.12.09   5   1     665057            0     85.8mb         85.8mb 
yellow open   winlogbeat-2016.11.22             5   1        376            0    356.4kb        356.4kb 
yellow open   winlogbeat-2016.11.23             5   1       1480            0    912.9kb        912.9kb 
yellow open   winlogbeat-2016.11.20             5   1        356            0    371.8kb        371.8kb 
yellow open   winlogbeat-2016.11.21             5   1        360            0    334.6kb        334.6kb 
yellow open   winlogbeat-2016.11.26             5   1        322            0      277kb          277kb 
yellow open   winlogbeat-2016.11.27             5   1        364            0    365.6kb        365.6kb 
yellow open   winlogbeat-2016.11.24             5   1        330            0    278.1kb        278.1kb 
yellow open   winlogbeat-2016.11.25             5   1        344            0    288.9kb        288.9kb 
yellow open   winlogbeat-2016.11.30             5   1       1936            0      1.3mb          1.3mb 
yellow open   winlogbeat-2015.12.10             5   1        162            0    345.4kb        345.4kb 
yellow open   winlogbeat-2015.12.15             5   1        153            0    271.7kb        271.7kb 
yellow open   winlogbeat-2016.12.09             5   1        348            0    256.9kb        256.9kb 
yellow open   winlogbeat-2015.12.16             5   1       1830            0      1.2mb          1.2mb 
yellow open   winlogbeat-2015.12.17             5   1        153            0    296.2kb        296.2kb 
yellow open   winlogbeat-2016.12.07             5   1       1482            0    953.2kb        953.2kb 
yellow open   winlogbeat-2015.12.18             5   1        144            0    218.6kb        218.6kb 
yellow open   winlogbeat-2016.12.08             5   1        334            0    283.3kb        283.3kb 
yellow open   winlogbeat-2015.12.11             5   1        180            0    295.6kb        295.6kb 
yellow open   winlogbeat-2015.12.12             5   1         90            0    223.5kb        223.5kb 
yellow open   winlogbeat-2015.12.13             5   1        162            0    292.2kb        292.2kb 
yellow open   winlogbeat-2015.12.14             5   1        174            0    346.1kb        346.1kb 
yellow open   winlogbeat-2016.12.01             5   1        386            0    381.1kb        381.1kb 
yellow open   %{[@metadata][beat]}-2016.12.14   5   1      21972            0      5.7mb          5.7mb 
yellow open   winlogbeat-2016.12.02             5   1        340            0    295.6kb        295.6kb 
yellow open   %{[@metadata][beat]}-2016.12.11   5   1     999166            0    227.3mb        227.3mb 
yellow open   winlogbeat-2016.12.05             5   1        350            0      339kb          339kb 
yellow open   winlogbeat-2015.12.19             5   1        111            0    247.2kb        247.2kb 
yellow open   %{[@metadata][beat]}-2016.12.10   5   1     898239            0    206.3mb        206.3mb 
yellow open   winlogbeat-2016.12.06             5   1        348            0    332.5kb        332.5kb 
yellow open   winlogbeat-2016.12.03             5   1        348            0    268.3kb        268.3kb 
yellow open   %{[@metadata][beat]}-2016.12.13   5   1     600078            0    137.9mb        137.9mb 
yellow open   winlogbeat-2016.12.04             5   1        384            0    346.3kb        346.3kb 
yellow open   %{[@metadata][beat]}-2016.12.12   5   1    1048682            0    237.7mb        237.7mb 
yellow open   winlogbeat-2016.12.12             5   1     172074            0    199.9mb        199.9mb 
yellow open   winlogbeat-2016.09.30             5   1        198            0    228.4kb        228.4kb 
yellow open   winlogbeat-2016.12.13             5   1     113974            0    137.2mb        137.2mb 
yellow open   winlogbeat-2016.12.10             5   1        356            0    412.7kb        412.7kb 

So I added that index %{[@metadata][beat]}-* and boom there were my syslog events in the kibana dashboard.

So I guess the remaining question I have is, is this due to grok parsing failures? Is this a default index that is used if parsing fails? Sorry if my questions are a bit rudimentary, but I'm trying to understand why I'm getting this result, instead of everything being sent to the default index as it was before.

Thanks!


(Sfchrisgleason) #10

Aaaaand, I just realized that it's creating that index due to the output I created in the syslog output for elasticsearch that specifically specifies that index (Which it obviously can't parse for syslog?)

index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"

I tried created a if else for the elasticsearch output and successfully had it put the data into a less crazily named index if grokparsefailure was in the tags, but decided to just go back to letting it put the data into the oddly named index for now.

Any idea why data passed by filebeat goes into the filebeat index, and syslog doesn't translate the index name and goes into the munged one?


(Mark Walkom) #11

What beat is creating that index?

We have day jobs too :slight_smile:


(Sfchrisgleason) #12

lol. That was my poor attempt at humor :wink:

The items that are winding up in that index are the pure syslog ones. The events I thought I were1 missing in the first place.

It seems that if the syslog events aren't matching any grok pattern, then they don't interpret the index field, but that's just me speculating why.

I also realized that I updated the files probably 50 times since I originally posted them, but it's the syslog input in the 10-syslog-filter file that created it I believe.


(Mark Walkom) #13

Are they tagged _grokparsefailure in ES?


(Sfchrisgleason) #14

Yup. All of them.


(Mark Walkom) #15

Ok, you may want to put a conditional in there to pipe them into a file at least, then you can do something to fix em and they won't end up in ES.


(Sfchrisgleason) #16

Yup. I did that in this post.

I actually DO want them in ES. I just didn't know they were being sent to the non-default index, and it took a while to figure that out.

At this point really my only questions are what could be causing the Grok parsing failure. I used this exact grok filter on this exact server using nearly the same ELK build a month ago and it worked totally fine for three different syslog senders and I didn't see this.

Any insight would be hugely appreciated as grok makes my head hurt a bit, but at this point my initial problem is solved.

Also why when these messages are being sent the aren't interpreting:

index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"

When something using filebeat/winlogbeat hits that output it inteprets the index as winlogbeat-2016-12-12 or filebeat-2016-12-12, but when the syslog messages hit it, they literally interpret the first half as %{[@metadata][beat]}-2016-12-12


(system) #17

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.