Parse very complicated logs with Logstash (braces and random key names)

Hello, I'm trying to parse very complicated logs like the ones below, and I'm struggling. Could someone help me?

2020-05-29T16:28:49.051146+02:00 machinehostname APP_NAME[31161] @cee: {"category": "USER", "context": {"authMode": "PASSWORD", "authUserName": "user56"}, "service": {"groupName": "monitoring_area", "host": "127.0.0.1", "name": "srv5632", "port": 542, "protocol": "http"}, "severity": "INFO", "source": {"authenticationMode": "PASSWORD", "ip": "10.56.72.23", "osInfo": "", "profiles": ["USER"], "protocol": "http", "realmName": "area11", "roles": ["pp8"], "sessionId": "05be23b8-598d-45bb-9559-c7c6896b7ade", "softwareInfo": "", "type": "HB", "userName": "user1777"}, "timestamp": "2020-05-29T16:28:49.051146+02:00", "type": "USER_SERVICE_DISCONNECTION"}
2020-05-29T16:28:49.124179+02:00 machinehostname APP_NAME[31161] @cee: {"beginDate": "2020-05-29T16:28:05.654609+02:00", "category": "USER", "context": {"authMode": "PASSWORD", "authUserName": "user56"}, "duration": 43, "endDate": "2020-05-29T16:28:49.051146+02:00", "service": {"groupName": "monitoring_area", "host": "127.0.0.1", "name": "srv5632", "port": 542, "protocol": "http"}, "severity": "INFO", "source": {"authenticationMode": "PASSWORD", "ip": "10.56.72.23", "osInfo": "", "profiles": ["USER"], "protocol": "http", "realmName": "area11", "roles": ["pp8"], "sessionId": "05be23b8-598d-45bb-9559-c7c6896b7ade", "softwareInfo": "", "type": "HB", "userName": "user1777"}, "status": "OK", "timestamp": "2020-05-29T16:28:49.124179+02:00", "type": "USER_SERVICE_CONNECTION_SUMMARY"}
2020-05-29T16:33:03.521952+02:00 machinehostname APP_NAME[31432] @cee: {"category": "ADMIN", "serviceName": "srv5632", "serviceType": "http", "severity": "INFO", "source": {"authenticationMode": "PASSWORD", "ip": "10.56.72.23", "osInfo": "Ubuntu", "profiles": ["ADMINISTRATOR"], "protocol": "WEB", "realmName": "area11", "roles": ["XXX-ADMINISTRATOR-PROFILE"], "sessionId": "dd064a76-430b-4101-9d9b-71f1f3a95407", "softwareInfo": "Firefox (76.0)", "type": "HB", "userName": "adm9654"}, "timestamp": "2020-05-29T16:33:03.521952+02:00", "type": "ADMIN_SERVICES_SERVICE_READ"}
2020-05-29T16:33:04.672836+02:00 machinehostname APP_NAME[31432] @cee: {"category": "ADMIN", "modificationDiff": "", "serviceName": "srv5632", "serviceType": "http", "severity": "WARNING", "source": {"authenticationMode": "PASSWORD", "ip": "10.56.72.23", "osInfo": "Ubuntu", "profiles": ["ADMINISTRATOR"], "protocol": "WEB", "realmName": "area11", "roles": ["XXX-ADMINISTRATOR-PROFILE"], "sessionId": "dd064a76-430b-4101-9d9b-71f1f3a95407", "softwareInfo": "Firefox (76.0)", "type": "HB", "userName": "adm9654"}, "timestamp": "2020-05-29T16:33:04.672836+02:00", "type": "ADMIN_SERVICES_SERVICE_MODIFY"}

I tried something like this:

input {
    # App logs input
    syslog {
        port => 10610
        type => "syslog"
        tags => ["app_logs"]
    }
}

filter {
    if "app_logs" in [tags] {
        grok {
            match => {"message" => "%{NUMBER} <%{DATA}>%{NUMBER} %{TIMESTAMP_ISO8601:timestamp} %{HOSTNAME:machine_hostname} %{WORD:application_name} %{NUMBER:app_log_category} %{NUMBER} %{GREEDYDATA}@cee: %{GREEDYDATA:raw_data}"}
        }
        kv {
            source => "raw_data"
            value_split => ":"
            include_brackets => true
        }
    }
}

output {
    if "app_logs" in [tags] {
        stdout { codec => rubydebug }
    }
}

But the key names change from one message to the next, so I don't know how to handle them.
Here is what I get while debugging:

[INFO ] 2021-05-11 15:13:00.170 [Ruby-0-Thread-34: /usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-input-syslog-3.4.5/lib/logstash/inputs/syslog.rb:123] syslog - new connection {:client=>"10.78.63.40:54342"}
{
                "@version" => "1",
                 "message" => "586 <190>1 2021-05-11T17:13:00.145642+02:00 machine-hostname-55 APP_NAME 1208 300005 - @cee: {\"category\": \"ADMIN\", \"severity\": \"INFO\", \"source\": {\"authenticationMode\": \"PASSWORD\", \"ip\": \"10.55.12.2\", \"osInfo\": \"CentOS\", \"profiles\": [\"ADMINISTRATOR\"], \"protocol\": \"WEB\", \"realmName\": \"area63\", \"roles\": [\"APP-ADMINISTRATOR-PROFILE\"], \"sessionId\": \"4264f862-45d0-4072-9c90-d71def1f9b84\", \"softwareInfo\": \"Firefox (88.0)\", \"type\": \"HB\", \"userName\": \"adm1254\"}, \"timestamp\": \"2021-05-11T17:13:00.145642+02:00\", \"type\": \"ADMIN_ADMINISTRATOR_DISCONNECTION_EXPIRED_ON_INACTIVITY\"}",
                "raw_data" => "{\"category\": \"ADMIN\", \"severity\": \"INFO\", \"source\": {\"authenticationMode\": \"PASSWORD\", \"ip\": \"10.25.32.12\", \"osInfo\": \"CentOS\", \"profiles\": [\"ADMINISTRATOR\"], \"protocol\": \"WEB\", \"realmName\": \"area63\", \"roles\": [\"APP-ADMINISTRATOR-PROFILE\"], \"sessionId\": \"4264f862-45d0-4072-9c90-d71def1f9b84\", \"softwareInfo\": \"Firefox (88.0)\", \"type\": \"HB\", \"userName\": \"adm1254\"}, \"timestamp\": \"2021-05-11T17:13:00.145642+02:00\", \"type\": \"ADMIN_ADMINISTRATOR_DISCONNECTION_EXPIRED_ON_INACTIVITY\"}",
                  "\"ip\"" => "\"10.25.32.12\",",
              "\"osInfo\"" => "\"CentOS\",",
               "\"roles\"" => "[\"APP-ADMINISTRATOR-PROFILE\"],",
              "@timestamp" => 2021-05-11T15:13:00.172Z,
           "{\"category\"" => "\"ADMIN\",",
                    "host" => "10.0.12.40",
                "severity" => 0,
            "\"protocol\"" => "\"WEB\",",
          "facility_label" => "kernel",
            "\"userName\"" => "\"adm1254\"},",
           "\"sessionId\"" => "\"4264f862-45d0-4072-9c90-d71def1f9b84\",",
        "machine_hostname" => "machine-hostname-55",
        "application_name" => "APP_NAME",
    "app_log_category" => "1208",
               "timestamp" => "2021-05-11T17:13:00.145642+02:00",
            "\"severity\"" => "\"INFO\",",
              "\"source\"" => "{\"authenticationMode\":",
            "\"profiles\"" => "[\"ADMINISTRATOR\"],",
           "\"timestamp\"" => "\"2021-05-11T17:13:00.145642+02:00\",",
        "\"softwareInfo\"" => "\"Firefox",
                "\"type\"" => [
        [0] "\"HB\",",
        [1] "\"ADMIN_ADMINISTRATOR_DISCONNECTION_EXPIRED_ON_INACTIVITY\"}"
    ],
                    "tags" => [
        [0] "app_logs",
        [1] "_grokparsefailure_sysloginput"
    ],
                "facility" => 0,
                "priority" => 0,
           "\"realmName\"" => "\"area63\",",
          "severity_label" => "Emergency"
}

Could someone help me with this parsing? Thanks!

Is this your full pipeline? Your example message has the tag _grokparsefailure_sysloginput, but this tag is not being set by the pipeline you shared.

The log messages you shared are not complicated. Each one has a fixed part, which you can parse with grok or dissect as you are already doing, and a JSON part, which you are storing in the raw_data field.

You are parsing raw_data with the wrong filter: do not use kv, use the json filter instead.

Try this:

json {
    source => "raw_data"
}
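
If it helps, here is a minimal sketch of how the whole filter block might look with kv swapped for json. The grok pattern is copied unchanged from your config. The target name "cee" is only a placeholder I picked, not something your logs require, and removing raw_data is optional.

```
filter {
    if "app_logs" in [tags] {
        # Same grok as in the question: split the fixed prefix from the JSON payload
        grok {
            match => {"message" => "%{NUMBER} <%{DATA}>%{NUMBER} %{TIMESTAMP_ISO8601:timestamp} %{HOSTNAME:machine_hostname} %{WORD:application_name} %{NUMBER:app_log_category} %{NUMBER} %{GREEDYDATA}@cee: %{GREEDYDATA:raw_data}"}
        }
        # Parse the payload as JSON, so changing key names, nested objects
        # and arrays are handled without listing any keys in advance
        json {
            source => "raw_data"
            # "cee" is a placeholder name (assumption). Without a target, the
            # parsed keys are written at the top level of the event and
            # overwrite existing fields such as "type", "severity" and "timestamp"
            target => "cee"
            # Optional: drop the raw string once it has been parsed successfully
            remove_field => ["raw_data"]
        }
    }
}
```

With a target set, nested values can be referenced as, for example, [cee][source][ip]. Events whose payload is not valid JSON are tagged _jsonparsefailure by default, so you can spot them in the output.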
