How to parse JSON in syslog_msg to fields?

@sghosh001c

Here's a working Logstash configuration, which I tested on Logstash 6.3.0.

The filters in this configuration perform three steps:

  1. The grok filter extracts the JSON string and puts it in a temporary field called payload_raw.
  2. The json filter parses the temporary payload_raw field and puts the parsed data in a field called payload.
  3. The mutate filter removes the temporary payload_raw field (along with message, port, and timestamp).

The advantage of this approach is that you don't need to know the structure of the JSON in advance. The json filter parses every key it finds and places it under payload.

Logstash configuration

input {
    tcp {
        port => 24514
    }
    udp {
        port => 24514
    }
}

filter {
    
    # Step 1. Extract the JSON String, put it in a temporary field called "payload_raw"
    # Docs: https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html
    grok {
        match => {
            "message" => [ "%{JSON:payload_raw}" ]
        }
        pattern_definitions => {
            "JSON" => "{.*$"
        }
    }
    
    # Step 2. Parse the temporary "payload_raw" field, put the parsed data in a field called "payload"
    # Docs: https://www.elastic.co/guide/en/logstash/current/plugins-filters-json.html
    json {
        source => "payload_raw"
        target => "payload"
    }
    
    # Step 3. Remove the temporary "payload_raw" field (and other fields)
    # Docs: https://www.elastic.co/guide/en/logstash/current/plugins-filters-mutate.html
    mutate {
        remove_field => [ "payload_raw", "message", "port", "timestamp" ]
    }
}

output {
    elasticsearch {
        hosts => [ "xx.xxx.xx.xx:9200" ]
        index => "syslog24514-%{+YYYY.MM.dd}"
    }
}
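
One caveat with the configuration above: the mutate step removes message unconditionally, so a line that contains no JSON (or malformed JSON) loses its original text. Below is a minimal sketch of a guarded filter block that only cleans up when both earlier steps succeeded. It assumes the default failure tags (_grokparsefailure for grok, _jsonparsefailure for json) have not been changed with tag_on_failure. I have not tested this variant, so treat it as a starting point.

```
filter {
    # Step 1. Same extraction as before
    grok {
        match => {
            "message" => [ "%{JSON:payload_raw}" ]
        }
        pattern_definitions => {
            "JSON" => "{.*$"
        }
    }

    # Only try to parse when grok actually found something
    if "_grokparsefailure" not in [tags] {
        # Step 2. Same parsing as before
        json {
            source => "payload_raw"
            target => "payload"
        }

        # Step 3. Only drop the raw text when the JSON parsed cleanly
        if "_jsonparsefailure" not in [tags] {
            mutate {
                remove_field => [ "payload_raw", "message", "port", "timestamp" ]
            }
        }
    }
}
```

With this layout, events that fail either step keep their message field and carry the failure tag, which makes them easy to find in Elasticsearch.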

Example input (the message field, from which payload_raw is extracted)

2018-11-14T12:07:54.446-05:00 [APP/PROC/WEB/0] [OUT] 2018-11-14 17:07:54.444 INFO 25 --- [nio-8080-exec-3] c.s.a.c.InventoryOrderController : {"serverName":"serverNameValue","eventComponent":"eventComponentValue","eventName":"eventNameValue","executionTime":"executionTimeValue","executedBy":"executedByValue","eventId":"eventIdValue","eventType":"eventTypeValue","serverIp":"serverIpValue","eventDetails":"eventDetailsValue"}

Example output for payload

{
  "payload" => {
    "eventComponent" => "eventComponentValue",
    "eventName" => "eventNameValue",
    "serverName" => "serverNameValue",
    "serverIp" => "serverIpValue",
    "eventId" => "eventIdValue",
    "eventDetails" => "eventDetailsValue",
    "executionTime" => "executionTimeValue",
    "eventType" => "eventTypeValue",
    "executedBy" => "executedByValue"
  }
}
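
If you want to try the filters without touching your syslog ports or Elasticsearch, a small test pipeline along these lines should work. This is a sketch, not part of the tested configuration: it swaps the inputs for stdin and the output for stdout with the rubydebug codec, and the file name test-json.conf is just a placeholder I made up.

```
# test-json.conf (hypothetical file name)
input {
    # Read test lines typed or piped into the terminal
    stdin { }
}

filter {
    grok {
        match => {
            "message" => [ "%{JSON:payload_raw}" ]
        }
        pattern_definitions => {
            "JSON" => "{.*$"
        }
    }
    json {
        source => "payload_raw"
        target => "payload"
    }
    mutate {
        remove_field => [ "payload_raw", "message", "port", "timestamp" ]
    }
}

output {
    # Print each event to the console in a readable form
    stdout {
        codec => rubydebug
    }
}
```

Run it with bin/logstash -f test-json.conf and paste the example input line. The printed event should contain a payload field like the one shown above, plus the usual metadata fields such as @timestamp and host.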