We have just started using the ELK stack, so please excuse the beginner questions. We have a situation where we have to use syslog to forward the logs from a SOLR server.
The format of the lines in the SOLR log file appears to be syslog-like.
Here is an example of a log line:
2019-02-20 11:31:10.626 INFO (commitScheduler-20-thread-1) [ ] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=true,prepareCommit=false}
But because we have to forward this using rsyslog, it gets encapsulated in a second syslog format, and when it is parsed by the logstash syslog module, the timestamp appears to be taken from the outer syslog message. The complete log line above becomes the message.
Here is the filter configuration that I have tried using:
filter {
  if [type] == "syslog" {
    if [logsource] in "solr2,SOLR3,solr1" {
      grok {
        match => { "message" => "%{SYSLOGHOST:syslog_program} %{TIMESTAMP_ISO8601} %{LOGLEVEL:loglevel} %{GREEDYDATA:syslog_message}" }
        add_field => [ "received_at", "%{@timestamp}" ]
        add_field => [ "received_from", "%{host}" ]
      }
      date {
        match => [ "timestamp", "yyyy-mm-dd HH:mm:ss.SSS" ]
        timezone => "UTC"
      }
    }
  }
}
How would you extract the data in a correct fashion?
I am currently getting _dateparsefailure. Why is that?
You say that the message output by the syslog input matches the line you show, but that line starts with a timestamp, whereas your grok pattern expects the message to start with a SYSLOGHOST. Can you remove the filter and show us what a message looks like with this output?
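A common way to dump complete events for inspection is the stdout output with the rubydebug codec. The section below is only an assumption about the kind of output meant here, not a confirmed part of the original setup:

output {
  # Print every event with all of its fields to the console
  stdout { codec => rubydebug }
}

With the filter removed, this should show exactly what the syslog input puts into message, timestamp, and logsource.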
You are discarding the timestamp that the pattern matches. Is that what you want? If you retain it using %{TIMESTAMP_ISO8601:ts}, then you could parse it with the following date match (note MM for month, not mm, which means minutes):
match => [ "ts", "yyyy-MM-dd HH:mm:ss.SSS" ]
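Putting those two pieces together, a minimal sketch might look like the following. It assumes the message field really does begin with the SOLR timestamp, which has not been confirmed yet. The field name solr_message is hypothetical and chosen only for illustration, and the UTC timezone is carried over from the filter in the question.

filter {
  grok {
    # Keep the SOLR timestamp in "ts" instead of discarding it
    match => { "message" => "^%{TIMESTAMP_ISO8601:ts} %{LOGLEVEL:loglevel} %{GREEDYDATA:solr_message}" }
  }
  date {
    # MM is month of year; mm would be minute of hour
    match => [ "ts", "yyyy-MM-dd HH:mm:ss.SSS" ]
    timezone => "UTC"
  }
}

If the grok does not match, the event gets a _grokparsefailure tag and ts is never created, so the date filter has nothing to parse.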
Where does the timestamp field come from? You can parse that using
match => [ "timestamp", "MMM dd HH:mm:ss" ]
Note that this timestamp does not contain a year, so Java will guess which year you want, and sometimes its guess may be wrong.
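As a rough sketch, a date filter for that field could list both the two-digit and the space-padded day forms, since classic syslog timestamps pad single-digit days with a space. This assumes timestamp holds a value such as "Feb 20 11:31:10", which has not been confirmed in this thread:

date {
  # Two patterns: "Feb 20 11:31:10" and "Feb  5 11:31:10" (two spaces before the day)
  match => [ "timestamp", "MMM dd HH:mm:ss", "MMM  d HH:mm:ss" ]
}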
if [logsource] in "solr2,SOLR3,solr1" {
will work, but it is testing whether logsource is a substring of the right-hand side. So if logsource is equal to "r2,SO" it will test true. I would use an array membership test.
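A minimal sketch of that array form, using the three host names from the question:

if [logsource] in ["solr2", "SOLR3", "solr1"] {
  # grok and date filters for the SOLR hosts go here
}

Here the test is true only when logsource is exactly equal to one of the listed values, so a fragment like "r2,SO" no longer matches.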