Need help with parsing json fields


(Adrian Hove) #1

I have a log line whose embedded JSON payload I am trying to parse.

[2019-03-16 00:00:00] production.INFO {"timestamp":1552694400,"execution_time":0.0272369384765625}

My default logstash config:

input {
  beats {
    port => 5044
  }
}
filter {
    grok {
        match => { "message" => "\[%{TIMESTAMP_ISO8601:timestamp}\] %{DATA:env}\.%{DATA:severity}: %{GREEDYDATA:message}"}
    }
    json {
        source => "message"
    }
}
output {
    elasticsearch {
        hosts => [ "elasticsearch:9200" ]
    }
}

Lastly, I have this in my filebeat.yml:

processors:
 - decode_json_fields:
     fields: ["message"]
     process_array: false
     max_depth: 1
     target: ""
     overwrite_keys: false

When I reindex my logs, no JSON parsing is done. What am I missing?


(Felix Stürmer) #2

Hi @Adrian_Hove,

It looks like you're trying to perform the JSON decoding in two places.

If the log file contains plain-text lines like the example you gave, the filebeat json processor would not be able to decode them, since each line doesn't consist solely of a valid JSON object.

The logstash configuration looks reasonable, except that the grok pattern contains a `:` that doesn't appear in the example message.

I would recommend choosing one of two options:

  • Send the data from filebeat to logstash to elasticsearch, with filebeat performing no processing and logstash performing the grok and json decoding.
  • Send the data from filebeat to elasticsearch directly and perform the grok and json decoding in an ingest pipeline.

You seem to be very close to getting the first option working. If you are comfortable configuring logstash and want to run it as a separate process, it's a reasonable choice. If you don't actually need logstash for anything else (right now and in the foreseeable future), learning how to write an ingest pipeline could save a bit of administrative overhead.


(Adrian Hove) #3

I got it to work. The problem was the carriage return at the end of the log line.

The JSON parsing was failing.

input {
  beats {
    port => 5044
  }
}
filter {
    grok {
        match => { "message" => "(?m)\[%{TIMESTAMP_ISO8601:timestamp}\] %{DATA:env}\.%{DATA:severity}: %{GREEDYDATA:messageJSON}\s*$"}
    }
    json {
        source => "messageJSON"
        target => "messageJSON"
        skip_on_invalid_json => true
    }
}
output {
    elasticsearch {
        ilm_enabled => true
        hosts => [ "elasticsearch:9200" ]
    }
    stdout {}
}
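For anyone hitting the same issue, the fix can be sanity-checked outside logstash. Here is a small Python sketch (the regex is only a hand-rolled approximation of the grok pattern, and the sample line is hypothetical, including the `:` after the severity that the pattern expects):

```python
import json
import re

# Rough Python equivalent of the fixed grok pattern: re.DOTALL plays the
# role of grok's (?m) prefix, and the trailing \s*$ keeps the carriage
# return out of the captured JSON.
PATTERN = re.compile(
    r"\[(?P<timestamp>[^\]]+)\] (?P<env>[^.]+)\.(?P<severity>[^:]+): "
    r"(?P<messageJSON>.*?)\s*$",
    re.DOTALL,
)

line = ('[2019-03-16 00:00:00] production.INFO: '
        '{"timestamp":1552694400,"execution_time":0.0272369384765625}\r\n')

match = PATTERN.match(line)
# The \r\n is swallowed by \s*$, so the JSON parser sees a clean object.
payload = json.loads(match.group("messageJSON"))
print(payload["execution_time"])
```

Without the `\s*$` the captured group ends in `\r`, which is exactly what tripped up the json filter in my case.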

I will start looking into how to write an ingest pipeline. If you have any documentation I can look at, that would be awesome.


(Felix Stürmer) #4

That's good to hear!

The filebeat documentation has a section about parsing data using ingest nodes, which shows a small example of defining a pipeline and adjusting the filebeat configuration.

The ingest node documentation itself is part of the elasticsearch documentation. By default, each elasticsearch node is also an ingest node.

An ingest pipeline consists of a sequence of processors that are very similar to logstash filter plugins. In your case the Grok and JSON processors should be of particular interest.
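For reference, here is a minimal sketch of such a pipeline for the log format above; the pipeline name `app-logs` is a placeholder to adapt:

```
PUT _ingest/pipeline/app-logs
{
  "description": "Grok the log line and decode the embedded JSON payload",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": [
          "\\[%{TIMESTAMP_ISO8601:timestamp}\\] %{DATA:env}\\.%{DATA:severity}: %{GREEDYDATA:messageJSON}"
        ]
      }
    },
    {
      "json": {
        "field": "messageJSON",
        "target_field": "messageJSON"
      }
    }
  ]
}
```

Filebeat can then send events through it via the `pipeline` setting under `output.elasticsearch` in filebeat.yml.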

Let us know if you encounter any further roadblocks.


(system) closed #5

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.