What is the number prefix in angle brackets at the start of the message field, before the date?

Hello, I'm new to the ELK stack and am having trouble figuring out something that's coming through in all my logs.

The 'message' field starts with a <##>MMM DD HH:MM:SS

I cannot for the life of me figure out what the <##> is supposed to be, why it's attached to the month without a space, etc.

What is the term for that field, and what information is it supplying? Is it supposed to be there? Will it cause any filters to fail? My current date filter is failing, and I thought my grok filter was failing because of it too until I used a simpler grok and started to build up.

Also, my message field repeats after a comma for some reason.

Here are samples of the output:

tags:_dateparsefailure host:10.1.3.66 syslog_timestamp:Aug 9 09:44:10 @timestamp:Aug 9, 2019 @ 09:44:11.255 syslog_facility:user-level syslog_facility_code:1 @version:1 type:syslog syslog_severity:notice syslog_severity_code:5 received_at:2019-08-09T13:44:11.255Z received_from:10.1.3.66 message:<4>Aug 9 09:44:10 ("U7LT,44d9e7fc371e,v3.9.42.9152") kernel: [12250717.020000] rx_clear=100, rx_frame=0, tx_frame=0 , ("U7LT,44d9e7fc371e,v3.9.42.9152") kernel: [12250717.020000] rx_clear=100, rx_frame=0, tx_frame=0 _id:pZ6fdmwBYAcnd78aXWpf _type:_doc _index:syslog-2019.08.09 _score: -

tags:_dateparsefailure host:192.168.100.215 syslog_timestamp:Aug 9 09:44:08 @timestamp:Aug 9, 2019 @ 09:44:08.172 syslog_facility:user-level syslog_facility_code:1 @version:1 type:syslog syslog_severity:notice syslog_severity_code:5 received_at:2019-08-09T13:44:08.172Z received_from:192.168.100.215 message:<86>Aug 9 09:44:08 U7PG2,f09fc2c8513e,v4.0.42.10433: dropbear[31972]: Exit before auth: Exited normally, U7PG2,f09fc2c8513e,v4.0.42.10433: dropbear[31972]: Exit before auth: Exited normally _id:Q56fdmwBYAcnd78aUWlV _type:_doc _index:syslog-2019.08.09 _score: -

Those are my syslogs, running through this config:

input {
  tcp {
    port => 1514
    type => syslog
  }
  udp {
    port => 1514
    type => syslog
  }
}
filter {
  if [type] == "syslog" {
    grok {
      #match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{GREEDYDATA:message}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
    syslog_pri { }
    date {
      match => [ "syslog_timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    sniffing => true
    manage_template => false
    index => "syslog-%{+YYYY.MM.dd}"
  }
}

I have a second input pipeline that takes our firewall messages, which have a lot more (and different) formatting; there I am using kv and csv filters. There is no parse failure, but I still have this <##> attached to the month. Because I am using kv here, and my regex skills are probably not as developed as they could be, I end up with this:

type:sonicwall host:192.168.1.99 @timestamp:Aug 9, 2019 @ 10:02:46.410 data.vpnpolicy:Brackney Tunnel data.time:2019-08-09 07:02:46 data.dstip:192.168.200.5 data.fw:<WAN_IP> data.c:262144 data.n:25456292 data.srciface:X1 data.m:98 data.<134>id:ea_firewall data.srcip:10.1.1.170 data.gcat:4 data.proto:tcp/22 data.pri:6 data.srcport:41067 data.dstport:22 data.dstiface:X0 data.sn:18B169D1CA60 data.sent:60 @version:1 _id:_qCwdmwBYAcnd78aYXH0 _type:_doc _index:sonicwall-2019.08.09 _score: -

input {
  tcp {
    port => 5514
    type => sonicwall
  }
  udp {
    port => 5514
    type => sonicwall
  }
}
filter {
  if [type] == "sonicwall" {
    # grok {
    #   match => { "message" => "%{GREEDYDATA:message}" }
    # }
    kv {
      source => "message"
      target => "data"
    }
    # Sonicwall src field
    csv {
      source => "[data][src]"
      separator => ":"
      columns => ["srcip","srcport","srciface","srcname"]
      target => "data"
    }
    # Sonicwall dst field
    csv {
      source => "[data][dst]"
      separator => ":"
      columns => ["dstip","dstport","dstiface","dstname"]
      target => "data"
    }
    # Sonicwall natsrc field
    csv {
      source => "[data][natSrc]"
      separator => ":"
      columns => ["natsrcip","natsrcport"]
      target => "data"
    }
    # Sonicwall natdst field
    csv {
      source => "[data][natDst]"
      separator => ":"
      columns => ["natdstip","natdstport"]
      target => "data"
    }
    mutate {
      remove_field => [ "message" ]
      remove_field => [ "[data][src]" ]
      remove_field => [ "[data][dst]" ]
      remove_field => [ "[data][natSrc]" ]
      remove_field => [ "[data][natDst]" ]
    }
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    sniffing => true
    manage_template => false
    index => "sonicwall-%{+YYYY.MM.dd}"
  }
}

That is a syslog PRI field.

What does the input look like?

Ahh! Thank you. I was wondering if that was the case, but I couldn't make the number match with anything. Now that I see it's a (Facility x 8 + Severity) that makes much more sense.
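For anyone else puzzling over those numbers, the (Facility x 8 + Severity) decoding can be sketched in a few lines. This is just an illustration (Python, not part of the pipeline); the facility and severity keywords follow RFC 3164:

```python
# Decode a syslog PRI value: pri = facility * 8 + severity (RFC 3164).
FACILITIES = [
    "kernel", "user-level", "mail", "daemon", "auth", "syslog", "lpr",
    "news", "uucp", "cron", "authpriv", "ftp", "ntp", "audit", "alert",
    "clock", "local0", "local1", "local2", "local3", "local4", "local5",
    "local6", "local7",
]
SEVERITIES = [
    "emergency", "alert", "critical", "error", "warning", "notice",
    "informational", "debug",
]

def decode_pri(pri: int) -> tuple[str, str]:
    facility, severity = divmod(pri, 8)
    return FACILITIES[facility], SEVERITIES[severity]

# The PRI values from the samples above:
print(decode_pri(4))    # ('kernel', 'warning')
print(decode_pri(86))   # ('authpriv', 'informational')
print(decode_pri(134))  # ('local0', 'informational')
```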

Is there any easier way to inspect my input other than creating a very simple pipeline with only an input and an output? I haven't quite figured that out either, or whether that method is even 100% accurate.

That would work. Use

output { stdout { codec => rubydebug } }

Ok, here are some samples:

{
       "message" => "<4>Aug 9 11:20:28 (\"U7LT,44d9e7fc371e,v3.9.42.9152\") kernel: [12256494.680000] rx_clear=99, rx_frame=0, tx_frame=0\n",
          "type" => "syslog",
          "host" => "10.1.3.66",
      "@version" => "1",
    "@timestamp" => 2019-08-09T15:20:28.954Z
}

{
       "message" => "<86>Aug 9 11:20:29 U7PG2,f09fc2c852dc,v4.0.42.10433: dropbear[7518]: Child connection from 10.1.1.170:40147",
          "type" => "syslog",
          "host" => "192.168.100.210",
      "@version" => "1",
    "@timestamp" => 2019-08-09T15:20:29.055Z
}

and a few that come from the SonicWall

{
      "@version" => "1",
    "@timestamp" => 2019-08-09T15:27:35.169Z,
          "host" => "10.1.1.1",
          "type" => "sonicwall",
       "message" => "<134> id=firewall sn=18B169CDF644 time=\"2019-08-09 11:27:35\" fw=<WAN_IP> pri=6 c=1024 gcat=6 m=537 msg=\"Connection Closed\" srcMac=2c:41:38:ab:10:e6 src=10.1.1.205:49543:X0 srcZone=LAN natSrc=<WAN_IP>:7546 dstMac=f4:b5:2f:0e:74:4c dst=93.184.220.29:80:X1 dstZone=WAN natDst=93.184.220.29:80 proto=tcp/http sent=508 rcvd=1034 spkt=6 rpkt=4 cdur=62616 rule=\"12 (LAN->WAN)\" app=49175 appName=\"General HTTP\" n=36206708"
}

So, it seems the Sonicwall has a space after the number, but just about everything else (so far I've only pointed our firewalls, switches [Cisco, HP, Netgear, and Ubiquiti], and APs [Ubiquiti] at ELK) does not. Are there standard filters that can account for this (and extrapolate a facility and severity from that number)?

There is a syslog_pri filter.

For the syslog messages I would start with

dissect { mapping => { "message" => "<%{pri}>%{ts} %{+ts} %{+ts} %{restOfLine}" } }

For the sonicwall messages, kv will ignore the <pri>, so your filter configuration looks good to me.
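Putting those pieces together, the syslog branch might look something like this. This is only a sketch built on the dissect line above: it assumes the dissected field names (`pri`, `ts`), and uses the syslog_pri filter's `syslog_pri_field_name` option to point it at the dissected `pri` instead of its default `syslog_pri` field:

```
filter {
  dissect {
    mapping => { "message" => "<%{pri}>%{ts} %{+ts} %{+ts} %{restOfLine}" }
  }
  # Derive facility/severity from the dissected PRI number.
  syslog_pri {
    syslog_pri_field_name => "pri"
  }
  # Extra "MMM  d" pattern covers single-digit days padded with two spaces.
  date {
    match => [ "ts", "MMM d HH:mm:ss", "MMM dd HH:mm:ss", "MMM  d HH:mm:ss" ]
  }
}
```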

Thank you so much for the help!

So with the sonicwall, it is not ignoring the PRI. It's treating it as part of the id= header, so I have fields that show up like the data.<134>id:ea_firewall in my sample above.

In 7.3.0 I see it being ignored. Still, easy enough to use mutate+gsub to remove it.
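The mutate+gsub mentioned above could look something like this. A sketch only; it assumes the PRI is always a leading <digits> token (optionally followed by whitespace) and that it should be stripped before kv runs:

```
filter {
  mutate {
    # Strip a leading "<###>" and any whitespace after it from the raw message.
    gsub => [ "message", "^<\d+>\s*", "" ]
  }
}
```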