Hello,
I need to create a grok pattern in Elasticsearch/Kibana/Filebeat to parse my message field, and I am running into two issues.
Here is my pipeline:
PUT /_ingest/pipeline/filebeat-7.6.2-system-syslog-pipeline
{
  "description": "Pipeline for parsing Syslog messages.",
  "processors": [
    {
      "grok": {
        "pattern_definitions": {
          "GREEDYMULTILINE": "(.|\n)*"
        },
        "ignore_missing": true,
        "field": "message",
        "patterns": [
          "%{SYSLOGTIMESTAMP:system.syslog.timestamp} %{SYSLOGHOST:host.hostname} %{DATA:process.name}(?:\\[%{POSINT:process.pid:long}\\])?: %{DATA:bgp.state}: %{DATA:bgp.protocol} %{IP:client.ip} \\(%{DATA:bgp.as}\\) %{GREEDYMULTILINE:system.syslog.message}",
          "%{SYSLOGTIMESTAMP:system.syslog.timestamp} %{SYSLOGHOST:host.hostname} %{DATA:process.name}(?:\\[%{POSINT:process.pid:long}\\])?: %{GREEDYMULTILINE:system.syslog.message}",
          "%{SYSLOGTIMESTAMP:system.syslog.timestamp} %{GREEDYMULTILINE:system.syslog.message}",
          "%{TIMESTAMP_ISO8601:system.syslog.timestamp} %{SYSLOGHOST:host.hostname} %{DATA:process.name}(?:\\[%{POSINT:process.pid:long}\\])?: %{GREEDYMULTILINE:system.syslog.message}",
          "%{SYSLOGTIMESTAMP:system.syslog.timestamp} %{SYSLOGHOST:host.hostname}_%{DATA:host.re} %{DATA:process.name}(?:\\[%{POSINT:process.pid:long}\\])?: %{DATA:bgp.state}: %{DATA:bgp.protocol} %{IP:client.ip} \\(%{DATA:bgp.as}\\) %{GREEDYMULTILINE:system.syslog.message}",
          "%{SYSLOGTIMESTAMP:system.syslog.timestamp} %{SYSLOGHOST:host.hostname} %{DATA:process.name}(?:\\[%{POSINT:process.pid:long}\\])?: %{DATA:bgp.state}: %{DATA:bgp.protocol} %{IP:client.ip} \\(%{DATA:bgp.as}\\): %{GREEDYMULTILINE:system.syslog.message}"
        ]
      }
    },
    {
      "remove": {
        "field": "message"
      }
    },
    {
      "rename": {
        "field": "system.syslog.message",
        "target_field": "message",
        "ignore_missing": true
      }
    },
    {
      "date": {
        "if": "ctx?.event?.timezone == null",
        "field": "system.syslog.timestamp",
        "target_field": "@timestamp",
        "formats": [
          "MMM  d HH:mm:ss",
          "MMM dd HH:mm:ss",
          "MMM d HH:mm:ss",
          "ISO8601"
        ],
        "on_failure": [
          {
            "append": {
              "field": "error.message",
              "value": "{{ _ingest.on_failure_message }}"
            }
          }
        ]
      }
    },
    {
      "date": {
        "if": "ctx?.event?.timezone != null",
        "field": "system.syslog.timestamp",
        "target_field": "@timestamp",
        "formats": [
          "MMM  d HH:mm:ss",
          "MMM dd HH:mm:ss",
          "MMM d HH:mm:ss",
          "ISO8601"
        ],
        "timezone": "{{ event.timezone }}",
        "on_failure": [
          {
            "append": {
              "field": "error.message",
              "value": "{{ _ingest.on_failure_message }}"
            }
          }
        ]
      }
    },
    {
      "remove": {
        "field": "system.syslog.timestamp"
      }
    }
  ],
  "on_failure": [
    {
      "set": {
        "field": "error.message",
        "value": "{{ _ingest.on_failure_message }}"
      }
    }
  ]
}
- The first issue is that I want to parse messages of this type (but not only this one):
PS: I have replaced the actual values with x for confidentiality.
"Feb 7 00:23:34 ig1-edge-dc3-01_re0 rpd[1524]: RPD_BGP_NEIGHBOR_STATE_CHANGED: BGP peer x.x.x.x (External AS x) changed state from EstabSync to Established (event RsyncAck) (instance master)"
I tested it with the _simulate option, which gave me what I wanted, but in Kibana Discover it returns: Provided Grok expressions do not match field value: [Feb 14 14:55:11 ig1-edge-dc3-01_re0 rpd[1524]: RPD_BGP_NEIGHBOR_STATE_CHANGED: BGP peer x.x.x.x (Internal AS x) changed state from Established to Idle (event RecvNotify) (instance master)]. I don't understand why.
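For reference, this is roughly the _simulate request I used to test it (with the same redacted values as above):
POST /_ingest/pipeline/filebeat-7.6.2-system-syslog-pipeline/_simulate
{
  "docs": [
    {
      "_source": {
        "message": "Feb 14 14:55:11 ig1-edge-dc3-01_re0 rpd[1524]: RPD_BGP_NEIGHBOR_STATE_CHANGED: BGP peer x.x.x.x (Internal AS x) changed state from Established to Idle (event RecvNotify) (instance master)"
      }
    }
  ]
}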
- The second issue is that I would like to parse all of these messages:
"Feb 14 10:13:32 ig1-ar-dc3-02 fpc0 DCBCM [ge-0/0/26]: phy layer link status [2nd try mismatch] is FALSE"
"Feb 9 00:57:14 ig1-edge-th2-01 rpd[9169]: RPD_BGP_NEIGHBOR_STATE_CHANGED: BGP peer x.x.x.x (External AS x) changed state from OpenConfirm to Established (event RecvKeepAlive) (instance master)"
"Feb 7 00:23:34 ig1-edge-dc3-01_re0 rpd[1524]: RPD_BGP_NEIGHBOR_STATE_CHANGED: BGP peer x.x.x.x (External AS x) changed state from EstabSync to Established (event RsyncAck) (instance master)"
"Feb 10 22:40:26 ig1-ar-dc3-01 rpd[2121]: BGP_IO_ERROR_CLOSE_SESSION: BGP peer x.x.x.x (External AS x): Error event Operation timed out(60) for I/O session - closing it"
but only the first pattern of the grok is applied, which is actually "%{SYSLOGTIMESTAMP:system.syslog.timestamp} %{SYSLOGHOST:host.hostname} %{DATA:process.name}(?:\\[%{POSINT:process.pid:long}\\])?: %{GREEDYMULTILINE:system.syslog.message}"
in my pipeline.
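For what it's worth, here is a minimal sketch of how the more specific BGP pattern can be tested on its own with an inline _simulate pipeline; the IP 192.0.2.1 and AS 65001 below are placeholders, since the real values are redacted:
POST /_ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "grok": {
          "field": "message",
          "pattern_definitions": {
            "GREEDYMULTILINE": "(.|\n)*"
          },
          "patterns": [
            "%{SYSLOGTIMESTAMP:system.syslog.timestamp} %{SYSLOGHOST:host.hostname} %{DATA:process.name}(?:\\[%{POSINT:process.pid:long}\\])?: %{DATA:bgp.state}: %{DATA:bgp.protocol} %{IP:client.ip} \\(%{DATA:bgp.as}\\) %{GREEDYMULTILINE:system.syslog.message}"
          ]
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "message": "Feb 9 00:57:14 ig1-edge-th2-01 rpd[9169]: RPD_BGP_NEIGHBOR_STATE_CHANGED: BGP peer 192.0.2.1 (External AS 65001) changed state from OpenConfirm to Established (event RecvKeepAlive) (instance master)"
      }
    }
  ]
}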
Were the other ones ignored, or is there a limit to the number of patterns?
Thanks for your time.