Logstash UDP Message Missing Beginning

Hi,

I am forwarding syslog from a device to Logstash on port 5514. However, my syslog grok, which works great for any Filebeat-shipped syslogs, fails for the UDP-forwarded syslogs. I have tested the grok with the grok debugging tools, and it succeeds if I copy the log line directly from the device into the debugger, so there isn't a problem with the grok itself. The problem I found is that when the logs arrive over UDP, the message field is missing the syslog timestamp and everything else before the actual message, which makes the grok parse fail.

So my question is: why is the log line missing the timestamp, etc.? Why was it stripped? I am not using the syslog input plugin, so there should be no processing done to the log line until it reaches filtering.

Here is my input setup:

input {
    tcp {
        port => 5514
        type => "syslog"
        codec => "plain"
        add_field => {"[@metadata][beat]" => "logstash" "[@metadata][type]" => "syslog"}
        tags => ["tcp"]
    }
    udp {
        port => 5514
        type => "syslog"
        codec => "plain"
        add_field => {"[@metadata][beat]" => "logstash" "[@metadata][type]" => "syslog"}
        tags => ["udp"]
    }
}

Here is my filter:

filter {
    if [type] == "syslog" {
        grok {
            match => {
                "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp}%{SPACE}%{SYSLOGHOST:syslog_hostname}%{SPACE}%{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?:%{SPACE}%{GREEDYDATA:syslog_message}"
            }
            add_field => [ "received_at", "%{@timestamp}" ]
            add_field => [ "received_from", "%{host}" ]
            break_on_match => false
        }
        syslog_pri { }
        date {
            match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
        }
        if [syslog_program] == "adclient" {
            grok {
                match => {
                    "syslog_message" => "INFO\s+AUDIT_TRAIL\|Centrify Suite\|(?<ad_program>[\w\s]+)\|1\.0\|%{POSINT:ad_code}\|(?<ad_action>%{GREEDYDATA})\|\d\|user=%{USERNAME:ad_user}\(type\:ad,\w+\@[\w\.]+\)\spid=%{POSINT:ad_pid}\sutc=%{INT:ad_utc}\scentrifyEventID=%{POSINT:ad_event_id}\sstatus=%{WORD:ad_status}\sservice=(?<ad_service>%{GREEDYDATA})\stty=(?<tty>[%{WORD}%{SPACE}]+)\s.*client=%{HOSTNAME:ad_client}.*"
                }
                break_on_match => false
            }
            geoip {
                source => "ad_client"
                target => "geoip"
                add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
                add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}" ]
            }
            mutate {
                convert => [ "[geoip][coordinates]", "float" ]
            }
        }
        if [syslog_program] == "logger" {
            grok {
                match => {
                    "syslog_message" => "<%{POSINT}>%{POSINT}%{SPACE}%{NUMBER}%{SPACE}%{USERNAME:meraki_organization}%{SPACE}%{WORD:meraki_roles}%{SPACE}%{GREEDYDATA:meraki_message}"
                }
                add_tag => "meraki"
            }
            if [meraki_roles] == "flows" {
                grok {
                    match => {
                        "meraki_message" => "src=%{HOSTNAME:meraki_src_client}%{SPACE}dst=%{HOSTNAME:meraki_dst_client}%{SPACE}(?:mac=%{MAC:meraki_src_mac}%{SPACE})?protocol=%{WORD:meraki_protocol}%{SPACE}sport=%{POSINT:meraki_src_port}%{SPACE}dport=%{POSINT:meraki_dst_port}%{GREEDYDATA}"
                    }
                    add_tag => "flows"
                }
            }
        }
    }
}

Here is an example message that fails when it shows up in Kibana:

<134>1 1472063783.811656753 studio_appliance flows src=192.168.20.183 dst=72.22.185.201 mac=68:5B:35:92:61:3E protocol=tcp sport=63894 dport=80 pattern: allow all

Here is the same message, from the same log, as shown in Papertrail:

Aug 24 14:36:23 209.160.216.98 logger: <134>1 1472063783.811656753 studio_appliance flows src=192.168.20.183 dst=72.22.185.201 mac=68:5B:35:92:61:3E protocol=tcp sport=63894 dport=80 pattern: allow all

The following is missing:

Aug 24 14:36:23 209.160.216.98 logger:
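
Since the message field arrives starting with `<134>1 ...`, one possible workaround is a grok that matches that raw form directly instead of expecting the `Aug 24 ... logger:` prefix. This is only a minimal sketch, assuming every UDP event has the same layout as the example above. The pattern is adapted from the "logger" grok in the filter, and the field names `meraki_pri` and `meraki_epoch` are hypothetical names introduced here for illustration.

```
filter {
    # Assumption: events from the udp input begin with "<PRI>VERSION EPOCH ..."
    if "udp" in [tags] and [message] =~ /^<\d+>/ {
        grok {
            match => {
                # meraki_pri and meraki_epoch are hypothetical field names
                "message" => "<%{POSINT:meraki_pri}>%{POSINT}%{SPACE}%{NUMBER:meraki_epoch}%{SPACE}%{USERNAME:meraki_organization}%{SPACE}%{WORD:meraki_roles}%{SPACE}%{GREEDYDATA:meraki_message}"
            }
            add_tag => ["meraki"]
        }
        date {
            # UNIX parses epoch seconds, including fractional values
            match => [ "meraki_epoch", "UNIX" ]
        }
    }
}
```

With something like this in place, the existing `[meraki_roles] == "flows"` grok could run on `meraki_message` without the outer `syslog_program` check.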

Well it's UDP, it's not guaranteed.
Are you sure the whole packet is arriving? Can you use TCP?

I don't think it's a UDP thing, as the timestamp is correct and the host is resolving in Logstash. Additionally, as I've said, the message shows up correctly in Papertrail. Our Logstash server is on the same internal network as the firewall forwarding the events. I can't force the firewall to send over TCP.

I'm not saying I'm missing events, as I understand that's part of the tradeoff with UDP. But why does the message not contain the timestamp or host info when Papertrail is showing it?

It seems Logstash is doing some processing on the message before it gets to filtering. I have a filter for syslog, but the message seems to fail it before it even gets there because it no longer conforms to the syslog format. Any ideas?
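
One way to check what Logstash actually has in the message field is to temporarily print the UDP-tagged events to the console. The sketch below assumes the `udp` tag set in the input above. The filters still run before any output, but none of the groks shown here overwrite `message`, so the printed field should reflect what the input produced.

```
output {
    # Temporary debugging output: dump every field of UDP-received events
    if "udp" in [tags] {
        stdout { codec => rubydebug }
    }
}
```

Comparing the printed `message` with a packet capture taken on the Logstash host would show whether the `Aug 24 14:36:23 209.160.216.98 logger:` prefix is ever present on the wire.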