Trouble using ingest pipeline to parse two different log formats


(Sam Barham) #1

As I understand it, ingest pipelines should be able to parse two different formats of log at once, by supplying multiple formats to the grok pattern list. I've done that, but I just can't seem to get it working, even with lots of fiddling.

Two example log lines:

2016-12-02T02:14:43.094093+00:00 | <daemon.err> | localhost | f343f43a3e6c[392]: | [ time="2016-12-02T02:14:43Z" level=info msg="Stuff happening."]

2016-12-02T02:17:01.972174+00:00 | <cron.info> | localhost | CRON[13747]: | [ (root) CMD ( cd / && run-parts --report /etc/cron.hourly)]

My ingest pipeline:

PUT _ingest/pipeline/mypipeline
{
  "description": "Parse logs",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": [
          "%{TIMESTAMP_ISO8601} \\| <%{DATA:facility}.%{LOGLEVEL}> \\| %{SYSLOGHOST:logsource} \\| %{SYSLOGPROG}: \\| \\[[\\s]*time=\"%{TIMESTAMP_ISO8601:time}\" level=%{LOGLEVEL:loglevel} msg=\"%{DATA:msg}\"]",
          "%{TIMESTAMP_ISO8601:time} \\| <%{DATA:facility}.%{LOGLEVEL:loglevel}> \\| %{SYSLOGHOST:logsource} \\| %{SYSLOGPROG}: \\| \\[[\\s]*%{DATA:msg}]"
        ]
      }
    }
  ]
}

If I use that pipeline, nothing gets through. If I remove the first of the patterns, logs get through, but the first kind have a msg field of 'time="2016-12-02T02:14:43Z" level=info msg="Stuff happening."'. Obviously, I'd prefer to actually parse that out rather than just leaving it as a lump. I've tried lots of things, such as GREEDYDATA or DATA, turning it into one pattern with a big "option1|option2" section, adding a second grok processor to parse the lump etc, and nothing seems to help. Any ideas?


(Alexander Reelsen) #2

Hey,

can you try to put your data and your pipeline together into the ingest simulate API, so we can also see the output?

--Alex
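
For anyone assembling such a simulate request by hand, the JSON escaping of backslashes and double quotes inside grok patterns is easy to get wrong. The following is a minimal Python sketch, not a definitive implementation, that builds the request body with json.dumps so the escaping is done mechanically. The helper name build_simulate_body is hypothetical, the `*` after `[\s]` is an assumption of this sketch, and the body layout ("pipeline" plus "docs" with "_source") follows the simulate request shown later in this thread. It only produces the request text; sending it to a cluster is left out.

```python
import json

# The two sample log lines from the first post, as plain text.
SAMPLE_LINES = [
    '2016-12-02T02:14:43.094093+00:00 | <daemon.err> | localhost | f343f43a3e6c[392]: | '
    '[ time="2016-12-02T02:14:43Z" level=info msg="Stuff happening."]',
    '2016-12-02T02:17:01.972174+00:00 | <cron.info> | localhost | CRON[13747]: | '
    '[ (root) CMD ( cd / && run-parts --report /etc/cron.hourly)]',
]

# Grok patterns as raw strings, so every backslash is written exactly once here.
# Assumption: "[\s]*" (with the quantifier) is what the patterns intend.
PATTERNS = [
    r'%{TIMESTAMP_ISO8601} \| <%{DATA:facility}.%{LOGLEVEL}> \| %{SYSLOGHOST:logsource} \| '
    r'%{SYSLOGPROG}: \| \[[\s]*time="%{TIMESTAMP_ISO8601:time}" '
    r'level=%{LOGLEVEL:loglevel} msg="%{DATA:msg}"]',
    r'%{TIMESTAMP_ISO8601:time} \| <%{DATA:facility}.%{LOGLEVEL:loglevel}> \| '
    r'%{SYSLOGHOST:logsource} \| %{SYSLOGPROG}: \| \[[\s]*%{DATA:msg}]',
]


def build_simulate_body(patterns, lines):
    """Return JSON text suitable as the body of POST _ingest/pipeline/_simulate."""
    body = {
        "pipeline": {
            "description": "Parse logs",
            "processors": [
                {"grok": {"field": "message", "patterns": patterns}}
            ],
        },
        # One simulated document per log line.
        "docs": [{"_source": {"message": line}} for line in lines],
    }
    # json.dumps doubles the backslashes and escapes the embedded quotes.
    return json.dumps(body, indent=2)


if __name__ == "__main__":
    print(build_simulate_body(PATTERNS, SAMPLE_LINES))
```

The printed text can be pasted under a POST _ingest/pipeline/_simulate line. Swapping entries in or out of PATTERNS makes it quick to test one pattern at a time.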


(Sam Barham) #3

Thanks for the hint about ingest simulate. I managed to figure out a solution using that. In the following, I parse out either version, then further parse the one with more values within 'msg'. I have no idea why that works when my other attempts didn't, but at least it does work.

POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "grok": {
          "trace_match": true,
          "field": "message",
          "patterns": [
            "%{GREEDYDATA:msg}"
          ]
        }
      },
      {
        "grok": {
          "field": "msg",
          "patterns": [
            "[\\s]*time=\"%{TIMESTAMP_ISO8601:time}\" level=%{LOGLEVEL:loglevel} msg=\"%{DATA:msg}\"",
            "[\\s]*%{DATA:msg}"
          ]
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "message": "(root) CMD ( cd / && run-parts --report /etc/cron.hourly)"
      }
    },
    {
      "_source": {
        "message": "time=\"2016-12-02T02:14:43Z\" level=info msg=\"Stuff happening.\""
      }
    }
  ]
}
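
The control flow of this two-stage approach (capture the message body first, then try the most specific pattern before a catch-all) can be sketched outside Elasticsearch. The following is a rough Python analogy using the standard re module, not grok itself, so it says nothing about why the single-processor pipeline failed. The regular expressions and the field names severity, program and pid are assumptions of this sketch, as is applying stage 1 to the full log lines from the first post.

```python
import re

# Sample lines copied from the first post.
SAMPLES = [
    '2016-12-02T02:14:43.094093+00:00 | <daemon.err> | localhost | f343f43a3e6c[392]: | '
    '[ time="2016-12-02T02:14:43Z" level=info msg="Stuff happening."]',
    '2016-12-02T02:17:01.972174+00:00 | <cron.info> | localhost | CRON[13747]: | '
    '[ (root) CMD ( cd / && run-parts --report /etc/cron.hourly)]',
]

# Stage 1: the prefix shared by both formats; the bracketed body lands in "msg".
COMMON = re.compile(
    r'(?P<timestamp>\S+) \| <(?P<facility>[^.>]+)\.(?P<severity>[^>]+)> \| '
    r'(?P<logsource>\S+) \| (?P<program>[^\[\s]+)\[(?P<pid>\d+)\]: \| '
    r'\[\s*(?P<msg>.*)\]$'
)

# Stage 2: tried in order, most specific first; the last entry is a catch-all.
MSG_PATTERNS = [
    re.compile(r'time="(?P<time>[^"]+)" level=(?P<loglevel>\w+) msg="(?P<msg>.*)"$'),
    re.compile(r'(?P<msg>.*)$'),
]


def parse(line):
    """Return a dict of extracted fields, or None if the common prefix is absent."""
    first = COMMON.match(line)
    if first is None:
        return None
    fields = first.groupdict()
    for pattern in MSG_PATTERNS:
        second = pattern.match(fields["msg"])
        if second:
            # Fields from the first matching pattern overwrite the raw "msg".
            fields.update(second.groupdict())
            break
    return fields


if __name__ == "__main__":
    for sample in SAMPLES:
        print(parse(sample))
```

Because the catch-all comes last, a line in the structured format gets time, loglevel and the inner msg extracted, while any other line keeps its whole body in msg.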

