Advice on parsing a JSON log


We are currently experiencing some issues parsing our JSON logs shipped from Filebeat to Elasticsearch.

An example of a log being sent is:


The filebeat.yml config is:

filebeat.inputs:
- type: log
  enabled: true
  close_inactive: 11m
  ignore_older: 48h
  clean_inactive: 72h
  paths:
    - /var/log/pods/dev*/*/*.log
  tail_files: true
  symlinks: true
  fields: {log_type: application_output}

output.logstash:
  hosts: ["XX.XX.XX.XX"]
  ssl.certificate_authorities: ["/usr/share/filebeat/logstash-remote.crt"]
  ssl.certificate: "/usr/share/filebeat/logstash-remote.crt"
  ssl.key: "/usr/share/filebeat/logstash-remote.key"
  client_authentication: none

logging.metrics.enabled: false
logging.selectors: ["*"]
logging.json: true
logging.level: debug

We currently have the below Logstash config in our POC environment:

input {
  beats {
    port => 5044
    ssl => true
    ssl_certificate_authorities => ["/etc/pki/certs/logstash-remote.crt"]
    ssl_certificate => "/etc/pki/certs/logstash-remote.crt"
    ssl_key => "/etc/pki/certs/logstash-remote.key"
    ssl_verify_mode => "force_peer"
  }
}

filter {

  if([fields][log_type] == "application_output") {
    # if the message actually is JSON
    if [message] =~ "^\{.*\}[\s\S]*$" {
      mutate { add_field => { "log.type" => "Application: JSON" } }

      json {
        id => "jsonfilter"
        source => "log"
        # remove some irrelevant fields
        remove_field => ["_sourceUri", "_user", "sourceUri", "user", "pid", "v"]
      }

      # unix epoch timestamp from our application output
      date {
        match => [ "time", "UNIX_MS" ]
        remove_field => ["time"]
      }

      mutate {
        rename => ["message", "app.rawOutput"]
      }
    }
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
}

Any help is appreciated.



What issues?

Hi @Badger,

The issue we are encountering is that the log itself isn't being parsed into its individual fields, such as:
level: 30
time: 1570692192768
pid: 1
hostname: platform-onprem-hybrid-connector-5cf44bdc98-9zjh8
name: hg-platform-onprem-hybrid-connector
port: 8080
v: 1
stream: stdout
time: 2019-10-10T07:23:12.769273893Z
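
Assembled from the fields listed above, the raw line presumably looks something like the following (illustrative only, not the actual log line): a Kubernetes/Docker wrapper object whose `log` field carries the application's own JSON as an escaped string, alongside `stream` and the container `time`.

```json
{"log":"{\"level\":30,\"time\":1570692192768,\"pid\":1,\"hostname\":\"platform-onprem-hybrid-connector-5cf44bdc98-9zjh8\",\"name\":\"hg-platform-onprem-hybrid-connector\",\"port\":8080,\"v\":1}\n","stream":"stdout","time":"2019-10-10T07:23:12.769273893Z"}
```

If the line has this shape, a single json filter pass only unwraps the outer object; the inner application JSON remains a string until it is parsed separately.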

It simply runs through the rest of the filters and displays in Logstash as per the below:

Any ideas?

It looks like you have a field called app.rawOutput that you should be passing to a json filter, and you may then need a second json filter to parse the log field nested within that.
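
A sketch of that two-stage approach, based on the field names in the config above (the `docker` and `app` target names are illustrative, and this is untested against the actual events):

```
filter {
  # first pass: parse the container runtime's wrapper JSON held in app.rawOutput
  json {
    source => "[app.rawOutput]"
    target => "docker"
  }
  # second pass: the wrapper's "log" field holds the application's own JSON as a string
  json {
    source => "[docker][log]"
    target => "app"
  }
}
```

With `target` set on each filter, the wrapper fields (stream, time) land under `docker` and the application fields (level, hostname, name, port) land under `app`, which avoids the two `time` fields colliding.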

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.