How do I create a filter for a custom app log with different lines?

Hello

I am new to Logstash/Elasticsearch.

I have a log file that contains several different kinds of lines. What's the best way to create a filter for a log similar to the example below?

2016-07-12T02:04:32.998+0100: Total time for which application threads were stopped: 0.0179270 seconds
2016-07-12T02:04:33.471+0100: [CMS-concurrent-sweep: 0.473/0.473 secs] [Times: user=0.48 sys=0.00, real=0.47 secs]
2016-07-12T02:04:33.494+0100: [CMS-concurrent-reset: 0.022/0.022 secs] [Times: user=0.02 sys=0.00, real=0.02 secs]
2016-07-12T02:04:35.494+0100: Application time: 2.4959740 seconds
2016-07-12T02:04:35.496+0100: [GC [1 CMS-initial-mark: 1143120K(1572864K)] 1144381K(2044736K), 0.0037850 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]

Thanks

This is my current conf file:

input {
  beats {
    port => 5044
    ssl => true
    ssl_certificate => "/etc/pki/tls/certs/logstash-forwarder.crt"
    ssl_key => "/etc/pki/tls/private/logstash-forwarder.key"
  }
}

filter {
  if [source] =~ "filename1.log" {
    grok {
      match => { "message" => "\A%{TIMESTAMP_ISO8601:time}%{SPACE}%{LOGLEVEL:loglevel}%{SPACE}%{DATA:datacode}%{SPACE}%{JAVACLASS:javaclass}%{SPACE}%{DATA:dataoperation}%{SPACE}%{NOTSPACE}%{SPACE}%{CISCO_REASON:code}%{DATA}%{SPACE}%{GREEDYDATA:lumpdata}" }
    }
    date {
      match => [ "time" , "dd/MMM/yyyy:HH:mm:ss Z" ]
    }
  }
  else if [source] =~ "filename2.log" {
    grok {
      # match => { "message" => "\A%{TIMESTAMP_ISO8601:time}\W\s%{DATA:data}\W\s%{GREEDYDATA:seconds}" }
      match => { "message" => "\A%{TIMESTAMP_ISO8601}%{NOTSPACE}%{SPACE}%{SYSLOG5424SD}%{SPACE}%{GREEDYDATA}" }
    }
  }
}

output {
  stdout { }
  elasticsearch {
    hosts => ["localhost:9200"]
    sniffing => true
    manage_template => false
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
}

I'd use the aggregate filter.
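
Something along these lines, just to show the shape of it (this is a rough sketch, not tailored to your log; the grok pattern, the field names, and the choice of [source] as the task id are all made up for the example):

filter {
  grok {
    # hypothetical pattern: pulls the pause time out of the "Total time ... stopped" lines
    match => { "message" => "\A%{TIMESTAMP_ISO8601:time}: Total time for which application threads were stopped: %{NUMBER:stopped_seconds:float} seconds" }
  }
  aggregate {
    # made-up task id: here we simply group by the originating file
    task_id => "%{source}"
    # assumes the Logstash 5+ event API (event.get); sums the pause times into the map,
    # skipping lines where grok didn't extract a number
    code => "map['total_stopped'] ||= 0.0; map['total_stopped'] += event.get('stopped_seconds') if event.get('stopped_seconds')"
    # emit the accumulated map as its own event when no new lines arrive for 60 s
    push_map_as_event_on_timeout => true
    timeout => 60
  }
}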

Hi Magnus

Many thanks - that plugin looks quite interesting.

Though in my case, the log lines aren't linked to the same task - would it still work?

Also, how would you define the grok for different lines as the log lines are very different to each other?

Thanks in advance!

Though in my case, the log lines aren't linked to the same task - would it still work?

Not sure what you mean here. Don't you want to emit a single event with all the numbers? Or was your question just about how to parse these different lines and emit multiple events?

Also, how would you define the grok for different lines as the log lines are very different to each other?

A single grok filter can list multiple expressions that'll get tried in order. There's an example of this in the documentation.
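
For example, something along these lines for the GC log you posted (the patterns and field names below are just a sketch of the idea; adjust them to your actual lines):

grok {
  match => {
    "message" => [
      "\A%{TIMESTAMP_ISO8601:time}: Total time for which application threads were stopped: %{NUMBER:stopped_seconds:float} seconds",
      "\A%{TIMESTAMP_ISO8601:time}: Application time: %{NUMBER:application_seconds:float} seconds",
      "\A%{TIMESTAMP_ISO8601:time}: %{GREEDYDATA:gc_detail}"
    ]
  }
}

Grok tries the expressions in order and stops at the first one that matches, so put the most specific patterns first and a catch-all last.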

Hi Magnus

Or was your question just about how to parse these different lines and emit multiple events?

That is correct, I was looking to parse these different lines and emit multiple events.

Okay, never mind the aggregate filter then.

To use multiple expressions, do you add multiple match lines under grok, or do all the expressions go within one match line?

So for example, to match a single line:
grok {
  match => { "message" => "\A%{TIMESTAMP_ISO8601:time}%{SPACE}%{LOGLEVEL:loglevel}%{SPACE}%{DATA:datacode}%{SPACE}%{JAVACLASS:javaclass}%{SPACE}%{DATA:dataoperation}%{SPACE}%{NOTSPACE}%{SPACE}%{CISCO_REASON:code}%{DATA}%{SPACE}%{GREEDYDATA:lumpdata}" }
}

OR

To match 2 different lines of a log file:
grok {
  match => { "message" => "\A%{TIMESTAMP_ISO8601:time}%{SPACE}%{LOGLEVEL:loglevel}%{SPACE}%{DATA:datacode}%{SPACE}%{JAVACLASS:javaclass}%{SPACE}%{DATA:dataoperation}%{SPACE}%{NOTSPACE}%{SPACE}%{CISCO_REASON:code}%{DATA}%{SPACE}%{GREEDYDATA:lumpdata}" }
  match => { "message" => "\A%{TIMESTAMP_ISO8601:time}%{SPACE}%{LOGLEVEL:loglevel}%{SPACE}%{GREEDYDATA:datacode}" }
}

See the example in the documentation prefixed by "If you need to match multiple patterns against a single field, the value can be an array of patterns".
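
In other words, a single match whose value is an array of patterns. Roughly like this, reusing the two expressions from your example (just a sketch, the patterns themselves are unchanged):

grok {
  match => {
    "message" => [
      "\A%{TIMESTAMP_ISO8601:time}%{SPACE}%{LOGLEVEL:loglevel}%{SPACE}%{DATA:datacode}%{SPACE}%{JAVACLASS:javaclass}%{SPACE}%{DATA:dataoperation}%{SPACE}%{NOTSPACE}%{SPACE}%{CISCO_REASON:code}%{DATA}%{SPACE}%{GREEDYDATA:lumpdata}",
      "\A%{TIMESTAMP_ISO8601:time}%{SPACE}%{LOGLEVEL:loglevel}%{SPACE}%{GREEDYDATA:datacode}"
    ]
  }
}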

Perfect, thank you very much!