Get timestamp from "embedded" syslog entry

Hi.

We have just started using the ELK stack, so please excuse the noobiness of these questions :slight_smile: We have a situation where we have to use syslog to forward the logs from a SOLR server.
The format of the lines in the SOLR logfile appears to be syslog-like.
Here is an example of a log line:
2019-02-20 11:31:10.626 INFO (commitScheduler-20-thread-1) [ ] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=true,prepareCommit=false}

But because we have to forward this using rsyslog, it gets wrapped in a second syslog format, and when it is parsed by the Logstash syslog module the timestamp appears to be taken from the transmitted syslog header. The complete log line above becomes the message.

Here is the filter configuration that I have tried using:
filter {
  if [type] == "syslog" {
    if [logsource] in "solr2,SOLR3,solr1" {
      grok {
        match => { "message" => "%{SYSLOGHOST:syslog_program} %{TIMESTAMP_ISO8601} %{LOGLEVEL:loglevel} %{GREEDYDATA:syslog_message}" }
        add_field => [ "received_at", "%{@timestamp}" ]
        add_field => [ "received_from", "%{host}" ]
      }
      date {
        match => [ "timestamp", "yyyy-mm-dd HH:mm:ss.SSS" ]
        timezone => "UTC"
      }
    }
  }
}

How would you extract the data in a correct fashion?
I am currently getting _dateparsefailure. Why is that?

You say that the message that is output by the syslog input matches the line you show, but that starts with a timestamp, and your grok pattern expects the message to start with a SYSLOGHOST. Can you remove the filter and show us what a message looks like with this output?

output { stdout { codec => rubydebug } }

What are you trying to do with this line?

if [logsource] in "solr2,SOLR3,solr1"

Thank you for replying.
Sorry, I pasted the message that was returned by the GROK.
Here is output without filter:

{
    "severity" => 5,
    "@version" => "1",
    "type" => "syslog",
    "message" => "SOLR 2019-02-20 14:22:59.859 INFO (commitScheduler-19-thread-1) [ ] o.a.s.u.DirectUpdateHandler2 end_commit_flush",
    "logsource" => "solr2",
    "facility_label" => "local0",
    "facility" => 16,
    "host" => "172.17.0.1",
    "timestamp" => "Feb 20 15:22:59",
    "priority" => 133,
    "severity_label" => "Notice",
    "@timestamp" => 2019-02-20T15:22:59.000Z
}

My second IF statement is there because I may have several different formats of syslog coming in, so I wanted to have different groks.

You are discarding the timestamp that the pattern matches. Is that what you want? If you retain it using %{TIMESTAMP_ISO8601:ts}, then you could parse it using this (note MM for month, not mm, which means minutes):

match => [ "ts", "yyyy-MM-dd HH:mm:ss.SSS" ]

Where does the timestamp field come from? You can parse that using

match => [ "timestamp", "MMM dd HH:mm:ss" ]

Note that timestamp does not contain a year, so Java will guess which year you want, and sometimes its guess may be wrong.
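
If both timestamps are worth keeping, one option is to parse the syslog header time into a field of its own, so that it does not overwrite @timestamp. This is only a sketch: the field name syslog_time is made up for illustration, and the second pattern (with two spaces) is there on the assumption that the sender pads single-digit days with a space, as traditional syslog does.

```
filter {
  date {
    # "timestamp" holds the year-less syslog header time, e.g. "Feb 20 15:22:59".
    # The second pattern covers space-padded single-digit days ("Feb  5 ...").
    match => [ "timestamp", "MMM dd HH:mm:ss", "MMM  d HH:mm:ss" ]
    # Store the result in a separate field (hypothetical name)
    # instead of the default target, @timestamp.
    target => "syslog_time"
  }
}
```

The caveat about the missing year still applies to whatever field the result is written to.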

if [logsource] in "solr2,SOLR3,solr1" {

will work, but it tests whether logsource is a substring of the right-hand side. So if logsource is equal to "r2,SO" it will test true. I would use an array membership test

if [logsource] in [ "solr2", "SOLR3", "solr1" ] {
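
Since the goal is a different grok per log format, the array test could be extended with an else-if branch per group of senders. The sketch below assumes a second group of hosts, app1 and app2, and a field called app_message; both are hypothetical placeholders, not something from this thread.

```
filter {
  if [type] == "syslog" {
    if [logsource] in [ "solr2", "SOLR3", "solr1" ] {
      # Solr lines: program name, ISO8601 timestamp, level, rest of the line.
      grok {
        match => { "message" => "%{SYSLOGHOST:syslog_program} %{TIMESTAMP_ISO8601:ts} %{LOGLEVEL:loglevel} %{GREEDYDATA:syslog_message}" }
      }
    } else if [logsource] in [ "app1", "app2" ] {
      # Hypothetical second format: replace with a pattern that fits those logs.
      grok {
        match => { "message" => "%{GREEDYDATA:app_message}" }
      }
    }
  }
}
```

As far as I know the membership test compares whole strings exactly, so "SOLR3" and "solr3" would be treated as different senders.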

No. Most of this is pure ignorance :smile:

I want to keep the timestamp that is matched in the pattern. I think the other timestamp comes from the syslog request. I will try your solution.

Thank you for your correction regarding the conditional. I was not aware that it worked like that. Consider it fixed :slight_smile:

Ok. So I tried your recommendations.
Here is the modified filter:

filter {
  if [type] == "syslog" {
    if [logsource] in [ "solr2","SOLR3","solr1" ] {
      grok {
        match => { "message" => "%{SYSLOGHOST:syslog_program} %{TIMESTAMP_ISO8601:ts} %{LOGLEVEL:loglevel} %{GREEDYDATA:syslog_message}" }
        add_field => [ "received_at", "%{@timestamp}" ]
        add_field => [ "received_from", "%{host}" ]
      }
      date {
        match => [ "ts", "yyyy-MM-dd HH:mm:ss.SSS" ]
        timezone => "UTC"
      }
    }
  }
}

And this is the output:

{
    "@timestamp" => 2019-02-20T15:44:30.034Z,
    "type" => "syslog",
    "logsource" => "solr2",
    "ts" => "2019-02-20 15:44:30.034",
    "timestamp" => "Feb 20 16:44:30",
    "loglevel" => "INFO",
    "host" => "172.17.0.1",
    "syslog_message" => " (searcherExecutor-12-thread-1) [ ] o.a.s.c.QuerySenderListener QuerySenderListener done.",
    "received_at" => "2019-02-20T16:44:30.000Z",
    "severity" => 5,
    "facility" => 16,
    "priority" => 133,
    "severity_label" => "Notice",
    "@version" => "1",
    "syslog_program" => "SOLR",
    "message" => "SOLR 2019-02-20 15:44:30.034 INFO (searcherExecutor-12-thread-1) [ ] o.a.s.c.QuerySenderListener QuerySenderListener done.",
    "facility_label" => "local0",
    "received_from" => "172.17.0.1"
}

Now how do I get the value of ts to be the timestamp that is used in ES?

date { match => [ "ts", "ISO8601" ] }

will parse ts into @timestamp.
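
Pulling the suggestions in this thread together, the Solr branch of the filter might look like the sketch below. Note that in the output posted above, @timestamp already carries the same value as ts, so the date filter there appears to be doing its job. Two things here go beyond what was discussed and are my own assumptions: %{SPACE} is added to absorb the padding after the log level (it shows up as a leading space in syslog_message above), and ts is removed once it has been parsed.

```
filter {
  if [type] == "syslog" {
    if [logsource] in [ "solr2", "SOLR3", "solr1" ] {
      grok {
        # Capture the embedded Solr timestamp as "ts" rather than discarding it.
        match => { "message" => "%{SYSLOGHOST:syslog_program} %{TIMESTAMP_ISO8601:ts} %{LOGLEVEL:loglevel}%{SPACE}%{GREEDYDATA:syslog_message}" }
        # Keep the time from the syslog header before @timestamp is replaced.
        add_field => { "received_at" => "%{@timestamp}" }
        add_field => { "received_from" => "%{host}" }
      }
      date {
        # MM is month; mm would be minutes.
        match => [ "ts", "yyyy-MM-dd HH:mm:ss.SSS" ]
        # Assumes the Solr log lines are written in UTC.
        timezone => "UTC"
        # Only applied when parsing succeeds, so a failed event keeps "ts".
        remove_field => [ "ts" ]
      }
    }
  }
}
```

If parsing fails, the event is tagged _dateparsefailure and still has ts on it, which makes the offending value easy to inspect with the rubydebug output.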

