Logstash setup for syslog UDP input

Hi guys!
I need your help with an advanced setup of an ELK server.

I have installed the ELK stack on Ubuntu 14.04.

Basically, I set up a Logstash server with Filebeat and successfully configured a Logstash filter for parsing the logs.
I can see all the logs coming from Filebeat in Kibana.

Now I want to switch log collection from Filebeat to a direct syslog input.

So I set up one of my application servers to send syslog output to my Logstash server.
Here is my configuration for the nginx application node:
access_log syslog:server=my-elk-server.domain:1514 mylog;

I have separated the Logstash configuration into different config files with the prefixes input-, filter-, and output-.

My input files look like this:
/etc/logstash/conf.d/input-filebeat.conf

input {
	beats {
		port => 5044
		type => "filebeat_log"
	}
}

/etc/logstash/conf.d/input-rsyslog.conf

input {
    udp {
	port => 1514
    }
    tcp {
	port => 1514
    }
}

The filter configuration is a little complicated, but I believe it won't be a problem for experienced members.

/etc/logstash/conf.d/filter-prod-filebeats.conf

filter {
    grok {
        match => [
            "message",
            '%{SYSLOGTIMESTAMP:timestamp} %{NOTSPACE:node_name} %{HTTPDUSER:ident}: (?:%{IP:remote_client_ip}|-) \[%{IP:client_ip}\] - (?:%{USER:protocol}|-) \[%{HTTPDATE:time_local}\] "%{IPORHOST:httphost}" "(?:%{WORD:verb} %{NOTSPACE:request})(?: HTTP/%{NUMBER:httpversion})" %{NUMBER:response_code} %{NUMBER:request_length} (?:%{NUMBER:bytes_sent}|-) (?:%{NUMBER:request_time}|-) %{QS:http_device} %{QS:http_user_agent}'
        ]
        add_tag => [ "nginx_access_log", "prod" ]
        remove_tag => [ "_grokparsefailure" ]
    }

    # production error.log processing
    grok {
        match => [
            "message",
            '%{SYSLOGTIMESTAMP:timestamp} %{NOTSPACE:node_name} %{HTTPDUSER:ident}: %{GREEDYDATA:error}host: "%{IPORHOST:httphost}"'
        ]
        add_tag => [ "nginx_error_log", "prod", "error" ]
    }

    grok {
        match => [
            "message",
            '%{SYSLOGTIMESTAMP:timestamp} %{NOTSPACE:node_name} %{HTTPDUSER:ident}: %{GREEDYDATA:error}'
        ]
        add_tag => [ "nginx_error_log", "prod", "error" ]
    }

    date {
        match => [ "timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss", "ISO8601" ]
        target => "@timestamp"
    }
}

Output is configured to both Elasticsearch and a file:
/etc/logstash/conf.d/output-elasticsearch.conf

output {
	elasticsearch {
		hosts => ["localhost:9200"]
		sniffing => true
		manage_template => false
		index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
		document_type => "%{[@metadata][type]}"
	}
}

/etc/logstash/conf.d/output-file.conf

output {
    file {
       path => "/var/log/logstash/output_line.log"
       codec => "line"
    }
}

So when I attached the syslog input, I could see new records in the output file together with the Filebeat entries, but the lines arriving over UDP look a little different:

2016-05-20T06:40:20.842Z 172.31.10.228 <190>May 20 06:40:20 node04 nginx: 217.119.17.14 [172.11.22.12] - https [20/May/2016:06:40:20 +0000] "customer.example.com" "PUT /command/ HTTP/1.1" 200 1050 302 0.192 "{017ef056-1aaa-4712-a8e7-7584612aea3b}" "M Windows Query: 2"

2016-05-20T06:40:12.000Z node01 May 20 06:40:12 node03 nginx: 197.17.12.19 [172.11.22.110] - - [20/May/2016:06:40:12 +0000] "customer.example.com" "PUT /command/ HTTP/1.1" 200 440 325 0.252 "" "M Windows Query: 4"

There is an extra <190> sequence after the timestamp in the lines that come in over UDP.
And there is a problem: these UDP lines do not end up in Kibana or Elasticsearch.
Only the lines that came from Filebeat (without <190>) are present in Kibana.

My questions are:
What did I get wrong with the UDP (syslog) input or the filter?
Why is the UDP input written to the file with the extra <190>?
And most importantly: why is the UDP input not parsed by the grok pattern if it's the same data, just from a different source?
And finally, if the filter does not work, can I save at least the unparsed data into Elasticsearch (something like a fallback action)?

Thank you in advance for any suggestions and advice!

What did I get wrong with the UDP (syslog) input or the filter?
Why is the UDP input written to the file with the extra <190>?

"<190>" is the facility and severity marker that's part of the syslog protocol. It's not present in on-disk files. You can use the syslog_pri filter to convert the number into something human readable.

And most important: why does UDP input not parsed with grok pattern if there are the same input from different sources ?

Because your grok filter doesn't take the facility/severity token into account?
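
One way around that (just a sketch; %{GREEDYDATA:rest} stands in for the remainder of your existing pattern) is to make the PRI token optional at the start of the pattern, so the same grok matches both the Filebeat lines and the raw UDP lines:

grok {
    match => [
        "message",
        '(?:<%{NONNEGINT:syslog_pri}>)?%{SYSLOGTIMESTAMP:timestamp} %{NOTSPACE:node_name} %{HTTPDUSER:ident}: %{GREEDYDATA:rest}'
    ]
}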

And finally, if the filter does not work, can I save at least the unparsed data into Elasticsearch (something like a fallback action)?

That's what it does by default. About the only occasion where Logstash drops an event (except when you explicitly request it to, obviously) is a mapping error, i.e. when ES can't accept the event because its contents aren't compatible with the index's mappings.
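
Events that fail every grok still get indexed; they just carry a _grokparsefailure tag (unless you remove it). If you want to keep them apart, something along these lines would work (a sketch; the unparsed-* index name is made up):

output {
    if "_grokparsefailure" in [tags] {
        # events that no grok matched still reach Elasticsearch, just in their own index
        elasticsearch {
            hosts => ["localhost:9200"]
            index => "unparsed-%{+YYYY.MM.dd}"
        }
    } else {
        elasticsearch {
            hosts => ["localhost:9200"]
            index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
        }
    }
}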


Thanks a lot! You're my Hero!

The confusion was the output using @metadata fields, which were absent on the events coming in over UDP.

Fixed by adding the metadata fields needed by the output section:

syslog_pri {
    add_field => { "[@metadata][type]" => "syslog" }
    add_field => { "[@metadata][beat]" => "syslog" }
}

So the final config looks like this:

input {
    udp {
        port => 1514
        type => "syslog"
    }
}

filter {
    syslog_pri {
        add_field => { "[@metadata][type]" => "syslog" }
        add_field => { "[@metadata][beat]" => "syslog" }
    }

    if [type] == "syslog" {
        # production access.log processing
        grok {
            match => [
                "message",
                '<190>%{SYSLOGTIMESTAMP:timestamp} %{NOTSPACE:node_name} %{HTTPDUSER:ident}: (?:%{IP:http_x_forwarded_for}|-) \[%{IP:remote_addr}\] - (?:%{USER:remote_user}|-) \[%{HTTPDATE:time_local}\] "%{IPORHOST:httphost}" "(?:%{WORD:verb} %{NOTSPACE:request})(?: HTTP/%{NUMBER:httpversion})" %{NUMBER:response_code} %{NUMBER:request_length} (?:%{NUMBER:bytes_sent}|-) (?:%{NUMBER:request_time}|-) %{QS:http_device} %{QS:http_user_agent}'
            ]
            break_on_match => true
            add_tag => [ "nginx_access_log", "prod" ]
            remove_tag => [ "_grokparsefailure" ]
        }

        # production error.log processing
        grok {
            match => [
                "message",
                '%{SYSLOGTIMESTAMP:timestamp} %{NOTSPACE:node_name} %{HTTPDUSER:ident}: %{GREEDYDATA:error}host: "%{IPORHOST:httphost}"'
            ]
            add_tag => [ "nginx_error_log", "prod", "error" ]
        }

        grok {
            match => [
                "message",
                '%{SYSLOGTIMESTAMP:timestamp} %{NOTSPACE:node_name} %{HTTPDUSER:ident}: %{GREEDYDATA:error}'
            ]
            add_tag => [ "nginx_error_log", "prod", "error" ]
        }
    }
}

output {
    elasticsearch {
        hosts => ["localhost:9200"]
        sniffing => true
        manage_template => false
        index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
        document_type => "%{[@metadata][type]}"
    }
}

Hope it might help somebody else
