Advice on parsing a JSON log

Hi,

We are currently experiencing some issues parsing our JSON logs shipped by Filebeat into Elasticsearch.

An example of a log being sent is:

{"log":"{\"level\":30,\"time\":1570692192768,\"msg\":\"APP_STARTED\",\"pid\":1,\"hostname\":\"platform-onprem-hybrid-connector-5cf44bdc98-9zjh8\",\"name\":\"hg-platform-onprem-hybrid-connector\",\"port\":8080,\"v\":1}\n","stream":"stdout","time":"2019-10-10T07:23:12.769273893Z"}
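Note the line is double-encoded JSON: the outer object is the Docker wrapper, and its "log" field is itself a JSON string written by the application. In Python terms (illustration only):

```python
import json

# The line Filebeat ships: a Docker JSON log wrapper whose "log" field
# is itself a JSON string produced by the application (double-encoded).
raw = ('{"log":"{\\"level\\":30,\\"time\\":1570692192768,\\"msg\\":\\"APP_STARTED\\",'
       '\\"pid\\":1,\\"hostname\\":\\"platform-onprem-hybrid-connector-5cf44bdc98-9zjh8\\",'
       '\\"name\\":\\"hg-platform-onprem-hybrid-connector\\",\\"port\\":8080,\\"v\\":1}\\n",'
       '"stream":"stdout","time":"2019-10-10T07:23:12.769273893Z"}')

outer = json.loads(raw)           # first decode: the Docker wrapper
inner = json.loads(outer["log"])  # second decode: the application's own JSON

print(outer["stream"])  # stdout
print(inner["msg"])     # APP_STARTED
print(inner["port"])    # 8080
```

So two decode passes are needed before fields like level and msg become individual fields.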

The filebeat.yml config is:

- type: log
  enabled: true
  close_inactive: 11m
  ignore_older: 48h
  clean_inactive: 72h
  paths:
    - /var/log/pods/dev*/*/*.log
  tail_files: true
  symlinks: true
  fields: {log_type: application_output}
output.logstash:
  hosts: ["XX.XX.XX.XX"]
  ssl.certificate_authorities: ["/usr/share/filebeat/logstash-remote.crt"]
  ssl.certificate: "/usr/share/filebeat/logstash-remote.crt"
  ssl.key: "/usr/share/filebeat/logstash-remote.key"
  client_authentication: none

logging.metrics.enabled: false
logging.selectors: ["*"]
logging.json: true
logging.level: debug
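For reference, a Filebeat log input can also decode JSON lines at the source via the json.* settings (a sketch only, not part of our current config; with Docker logs this only unwraps the outer layer, so the "log" field would still contain the application's JSON string):

```yaml
# Sketch: same input, but with Filebeat decoding the outer JSON wrapper.
- type: log
  enabled: true
  paths:
    - /var/log/pods/dev*/*/*.log
  json.keys_under_root: true   # merge decoded keys into the event root
  json.add_error_key: true     # add error.message on decode failures
```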

We currently have the below Logstash pipeline config in our POC environment:

input {
  beats {
    port => 5044
    ssl => true
    ssl_certificate_authorities => ["/etc/pki/certs/logstash-remote.crt"]
    ssl_certificate => "/etc/pki/certs/logstash-remote.crt"
    ssl_key => "/etc/pki/certs/logstash-remote.key"
    ssl_verify_mode => "force_peer"
  }
}

filter {

  if([fields][log_type] == "application_output") {
    # if the message actually is JSON
    if [message] =~ "^\{.*\}[\s\S]*$" {
      mutate { add_field => { "log.type" => "Application: JSON" } }

      json {
        id => "jsonfilter"
        source => "log"
        # remove some irrelevant fields
        remove_field => ["_sourceUri", "_user", "sourceUri", "user", "pid", "v"]
      }

      # unix epoch timestamp from our application output
      date {
        match => [ "time", "UNIX_MS" ]
        remove_field => ["time"]
      }

      mutate {
        rename => ["message", "app.rawOutput"]
      }
    }
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    }
}

Any help is appreciated.

Thanks,
Callum

bump

What issues?

Hi @Badger,

The issue we are encountering is that the log isn't being split into its individual fields, such as:
level: 30
time: 1570692192768
msg: APP_STARTED
pid: 1
hostname: platform-onprem-hybrid-connector-5cf44bdc98-9zjh8
name: hg-platform-onprem-hybrid-connector
port: 8080
v: 1
stream: stdout
time: 2019-10-10T07:23:12.769273893Z

It simply processes the rest of the filters and the event displays in Logstash with the JSON still unparsed, as per the below:

Any ideas?

It looks like you have a field called app.rawOutput that you should be passing to a json filter, and possibly a second json filter to parse the log field within that.
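A minimal sketch of that two-stage parse (field names taken from the example log earlier in the thread; untested):

```
filter {
  # First pass: the event Filebeat ships carries the whole Docker JSON
  # line in [message], so parse that rather than a "log" field that
  # does not exist yet.
  json {
    source => "message"
  }
  # Second pass: the wrapper's "log" field is itself a JSON string
  # written by the application; parse it to get level, msg, port, etc.
  json {
    source => "log"
  }
}
```

The key point is that the first json filter must read from message; only after it runs does a log field exist for the second filter to parse.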