mmartinez
(Mario Martinez)
May 3, 2023, 12:39pm
1
Hello,
We have logs in several similar but distinct formats, and we would like to use the ingest pipeline dissect processor with multiple patterns to dissect them. We have already tried grok, but our ingestion volume is massive and the performance is horrible.
This would be the perfect solution but it doesn't exist:
{
"dissect": {
"if": "ctx.tags.contains('httpd') && ctx.tags.contains('isprime') && ctx.tags.contains('access')",
"field": "message",
"patterns": [
"%{apache2.access.domain} %{apache2.access.vhost} \"%{apache2.access.filename}\" %{apache2.access.remote_ip} %{?auth} %{apache2.access.user} [%{apache2.access.timestamp}] \"%{apache2.access.method} %{apache2.access.url} HTTP/%{apache2.access.http_version}\" %{apache2.access.response_code} %{apache2.access.body_sent.bytes} \"%{apache2.access.referrer}\" \"%{apache2.access.user_agent}\" \"%{apache2.access.xff}\" \"%{apache2.access.protocol}\"",
"%{apache2.access.domain} %{apache2.access.vhost} \"%{apache2.access.filename}\" %{apache2.access.remote_ip} %{?auth} %{apache2.access.user} [%{apache2.access.timestamp}] \"%{apache2.access.method} %{apache2.access.url} HTTP/%{apache2.access.http_version}\" %{apache2.access.response_code} %{apache2.access.body_sent.bytes} \"%{apache2.access.referrer}\" \"%{apache2.access.user_agent}\"",
"%{apache2.access.domain} %{apache2.access.vhost} \"%{apache2.access.filename}\" %{apache2.access.remote_ip} %{?auth} %{apache2.access.user} [%{apache2.access.timestamp}] \"%{apache2.access.method} %{apache2.access.url} RTSP/%{apache2.access.rtsp_version}\" %{apache2.access.response_code} %{apache2.access.body_sent.bytes} \"%{apache2.access.referrer}\" \"%{apache2.access.user_agent}\" \"%{apache2.access.xff}\" \"%{apache2.access.protocol}\""
],
"ignore_missing" : true
}
}
Does anybody know if there is a similar solution or a different one to solve this? Thanks
Mario
stephenb
(Stephen Brown)
May 3, 2023, 3:13pm
2
Hi @mmartinez
So no, there is no multi-pattern dissect today; perhaps you could open a feature request.
I think there are a couple of approaches:
1. Use 3 dissect processors in cascade, in order, with a simple if condition on the 2nd and 3rd that checks whether a field from the previous dissect exists (so they only run when it does not). This will still be pretty efficient.
2. The 1st and 3rd patterns appear to be the same other than 1 field and 1 constant, so you could certainly use a single dissect and then rename the field later.
3. Combine 1 & 2.
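A minimal sketch of the first approach (cascaded dissect processors), with the patterns shortened for readability; the shortened patterns are illustrative assumptions, not the real log format. Dissect raises an error when a pattern does not match, so this sketch sets ignore_failure on each processor to give the next one a chance to run. As far as I know, dissect writes dotted keys as nested objects, so the if conditions use null-safe navigation instead of ctx.containsKey on the dotted name:

```json
{
  "processors": [
    {
      "dissect": {
        "description": "Illustrative shortened pattern (assumption): HTTP line with xff",
        "field": "message",
        "pattern": "%{apache2.access.remote_ip} \"%{apache2.access.method} %{apache2.access.url} HTTP/%{apache2.access.http_version}\" \"%{apache2.access.user_agent}\" \"%{apache2.access.xff}\"",
        "ignore_failure": true
      }
    },
    {
      "dissect": {
        "description": "Runs only if nothing matched yet: HTTP line without xff (assumption)",
        "if": "ctx.apache2?.access?.user_agent == null",
        "field": "message",
        "pattern": "%{apache2.access.remote_ip} \"%{apache2.access.method} %{apache2.access.url} HTTP/%{apache2.access.http_version}\" \"%{apache2.access.user_agent}\"",
        "ignore_failure": true
      }
    },
    {
      "dissect": {
        "description": "Runs only if nothing matched yet: RTSP line with xff (assumption)",
        "if": "ctx.apache2?.access?.user_agent == null",
        "field": "message",
        "pattern": "%{apache2.access.remote_ip} \"%{apache2.access.method} %{apache2.access.url} RTSP/%{apache2.access.rtsp_version}\" \"%{apache2.access.user_agent}\" \"%{apache2.access.xff}\"",
        "ignore_failure": true
      }
    }
  ]
}
```

With ignore_failure on the last processor as well, a line that matches none of the patterns passes through unparsed. Dropping it from the last dissect would surface those lines as ingest failures instead.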
If I am seeing this correctly, the 1st and 3rd could look like the pattern below; then rename the field based on the value of %{apache2.protocol}:
"%{apache2.access.domain} %{apache2.access.vhost} \"%{apache2.access.filename}\" %{apache2.access.remote_ip} %{?auth} %{apache2.access.user} [%{apache2.access.timestamp}] \"%{apache2.access.method} %{apache2.access.url} %{apache2.protocol}/%{apache2.access.http_version}\" %{apache2.access.response_code} %{apache2.access.body_sent.bytes} \"%{apache2.access.referrer}\" \"%{apache2.access.user_agent}\" \"%{apache2.access.xff}\" \"%{apache2.access.protocol}\""
mmartinez
(Mario Martinez)
May 3, 2023, 3:18pm
3
Thanks Stephen!
I will work on that and let you know whether I succeed.
mmartinez
(Mario Martinez)
May 5, 2023, 10:36am
4
Hi @stephenb
In the end I did this, but I'm not sure I'm following the correct order.
{
"dissect": {
"if": "ctx.tags.contains('httpd') && ctx.tags.contains('isprime') && ctx.tags.contains('access')",
"field": "message",
"pattern": "%{apache2.access.domain} %{apache2.access.vhost} \"%{apache2.access.filename}\" %{apache2.access.remote_ip} %{?auth} %{apache2.access.user} [%{apache2.access.timestamp}] \"%{apache2.access.method} %{apache2.access.url} %{?apache2.access.http_todelete}/%{apache2.access.http_version}\" %{apache2.access.response_code} %{apache2.access.body_sent.bytes} \"%{apache2.access.referrer}\" \"%{apache2.access.user_agent}\" \"%{apache2.access.xff}\" \"%{apache2.access.protocol}\"",
"ignore_missing" : true
}
},
{
"dissect": {
"if": "ctx.tags.contains('httpd') && ctx.tags.contains('isprime') && ctx.tags.contains('access') && ctx.apache2?.access?.protocol == null",
"field": "message",
"pattern": "%{apache2.access.domain} %{apache2.access.vhost} \"%{apache2.access.filename}\" %{apache2.access.remote_ip} %{?auth} %{apache2.access.user} [%{apache2.access.timestamp}] \"%{apache2.access.method} %{apache2.access.url} %{?apache2.access.http_todelete}/%{apache2.access.http_version}\" %{apache2.access.response_code} %{apache2.access.body_sent.bytes} \"%{apache2.access.referrer}\" \"%{apache2.access.user_agent}\"",
"ignore_missing" : true
}
},
{
"dissect": {
"if": "ctx.tags.contains('httpd') && ctx.tags.contains('isprime') && ctx.tags.contains('access') && ctx.apache2?.access?.user_agent == null",
"field": "message",
"pattern": "%{apache2.access.domain} %{apache2.access.vhost} \"%{apache2.access.filename}\" %{apache2.access.remote_ip} %{?auth} %{apache2.access.user} [%{apache2.access.timestamp}] \"%{apache2.access.url} HTTP/%{apache2.access.http_version}\" %{apache2.access.response_code} %{apache2.access.body_sent.bytes} \"%{apache2.access.referrer}\" \"%{apache2.access.user_agent}\" \"%{apache2.access.xff}\" \"%{apache2.access.protocol}\"",
"ignore_missing" : true
}
},
stephenb
(Stephen Brown)
May 5, 2023, 1:22pm
5
You just need to look at your data and put the most common pattern first, then the 2nd most common, then the 3rd.
I'm a little confused why you need all 3. With %{?apache2.access.http_todelete} in place of the protocol, aren't 1 and 3 the same?
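One way to express "most common first" without if checks is to chain the patterns with on_failure, so the next pattern is only tried when the previous one fails to match. A sketch with shortened, illustrative patterns (assumptions, not the real format), assuming the longer format is the most common one in your data and using a named skip key (%{?proto}, a hypothetical name) so that HTTP and RTSP lines share one pattern:

```json
{
  "dissect": {
    "description": "Assumed most common format: HTTP or RTSP line with xff",
    "field": "message",
    "pattern": "%{apache2.access.remote_ip} \"%{apache2.access.method} %{apache2.access.url} %{?proto}/%{apache2.access.http_version}\" \"%{apache2.access.user_agent}\" \"%{apache2.access.xff}\"",
    "on_failure": [
      {
        "dissect": {
          "description": "Fallback, tried only when the first pattern fails: line without xff",
          "field": "message",
          "pattern": "%{apache2.access.remote_ip} \"%{apache2.access.method} %{apache2.access.url} %{?proto}/%{apache2.access.http_version}\" \"%{apache2.access.user_agent}\""
        }
      }
    ]
  }
}
```

If the fallback pattern also fails, the error propagates, which keeps unparsed lines visible rather than silently indexed.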