Blindly parse JSON from logs and add it to Elasticsearch


I've read quite a few topics about this, and either I don't understand it correctly or it's not possible.

I'm receiving logs formatted this way:

"[INFO]  JSEXTRA:{\"user_id\": \"90:70:65:12:56:D5\", \"subscriber\": \"localhost:8060\", \"tag\": \"sub_connected\"}"

And I'm doing this filtering:

filter {
  grok {
    match => {
      "message" => '^ \"\[%{WORD:log_level}\]  %{WORD:wone}:%{GREEDYDATA:data_json}\"'
    }
    remove_field => [ "message" ]
  }
  json {
    source => "data_json"
    target => "doc"
    remove_field => [ "data_json" ]
  }
}

But the problem here is that I don't know in advance what the JSON keys/values will be, so I don't see how I could use the "mutate" filter.

So, what I would like is to have Logstash parse the JSON and render it as accessible fields in Kibana. Is this doable?

I've got an essentially similar set of logs, my Logstash configuration looks like yours, and the JSON objects in the log line end up in the "target" field. So I've got target => "json" and I get json.this, json.that, json.the.other etc. in Elasticsearch (and hence Kibana). In other words, it just works, by magic.
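To see why no mutate filter is needed, here is a minimal Python sketch of what the json filter does: the parser returns whatever keys the line happens to contain, so the field names need not be known in advance. The sample payload is taken from the log line above.

```python
import json

# Sample payload from the log line above; the keys could be anything.
raw = '{"user_id": "90:70:65:12:56:D5", "subscriber": "localhost:8060", "tag": "sub_connected"}'

doc = json.loads(raw)  # a dict with whatever keys the line contains
for key, value in doc.items():
    print(f"doc.{key} = {value}")  # becomes doc.user_id, doc.subscriber, doc.tag
```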

But there is a big gotcha. If you have different log lines, coded by different people, which include JSON fields with the same names but different data types, it ain't gonna work. You'll get mapping errors in the logs and missing documents in Elasticsearch.


Thank you for this, TimWard!

Nonetheless, the docs say that if you need the JSON to be accessible at the root of the ES entry you don't have to add much more (than what I did), but nothing shows up on my side so far... It's really hard to understand what's happening and why, haha.

The problem here is that that pattern strips the closing "} from the JSON, so it is no longer valid JSON. You should be getting a _jsonparsefailure tag if you are actually running that configuration. Try

 "message" => '^ \[%{WORD:log_level}\]  %{WORD:wone}:%{GREEDYDATA:data_json}'
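The corrected pattern can be sanity-checked with a rough Python-regex equivalent (WORD becomes \w+, GREEDYDATA becomes .*); the sample line here is a simplified, already-unescaped version of the log in question:

```python
import re
import json

# Rough Python equivalent of the corrected grok pattern:
#   ^ \[%{WORD:log_level}\]  %{WORD:wone}:%{GREEDYDATA:data_json}
pattern = re.compile(r'^ \[(?P<log_level>\w+)\]  (?P<wone>\w+):(?P<data_json>.*)')

# Simplified sample line (quotes already unescaped for clarity).
line = ' [INFO]  JSEXTRA:{"user_id": "90:70:65:12:56:D5", "tag": "sub_connected"}'
m = pattern.match(line)
print(m.group('log_level'))               # INFO
print(m.group('data_json'))               # full JSON object, braces included
print(json.loads(m.group('data_json')))   # parses cleanly now
```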

Also, do not remove fields until you are certain that they look the way you want them. Comment the removal out while debugging:

#     remove_field => [ "data_json" ]

Ah, I'm so sorry about that, the log comes wrapped in quotes.
(I've edited my first post)

It's super annoying, but I have to escape the last quote. And indeed you're right, I get this message a lot.
Here is what the parsing returns in Kibana (JSON view):

"message": "msg:\"[demo-xiaomi][CAM       ][2018-04-26-16:32:42.041 1901064912] [INFO]  JSEXTRA:{\\\"camera_id\\\": \\\"0\\\", \\\"user_id\\\": \\\"50:F1:4A:E4:99: F\\\", \\\"tag\\\": \\\"streaming_stopped\\\"}\" ",
"data_json": "{\\\"camera_id\\\": \\\"0\\\", \\\"user_id\\\": \\\"50:F1:4A:E4:99: F\\\", \\\"tag\\\": \\\"streaming_stopped\\\"}"

but this JSON isn't showing at the root of ES...

Just to make sure. If you view the input line in a text editor (not in Kibana) are the double quotes really escaped using a backslash? The fact that the backslashes are escaped in your last post suggests they are. If they are then you would need to unescape them before trying to parse the JSON.

mutate { gsub => [ "message", '\\"', '"' ] }
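What that gsub does, sketched in Python: the extracted field still contains backslash-escaped quotes, so it is not valid JSON until they are unescaped. The sample value is the data_json field from the Kibana output above.

```python
import json

# data_json as extracted: quotes are still backslash-escaped,
# so json.loads would fail on it as-is.
data_json = '{\\"user_id\\": \\"50:F1:4A:E4:99: F\\", \\"tag\\": \\"streaming_stopped\\"}'

# Equivalent of: mutate { gsub => [ "message", '\\"', '"' ] }
unescaped = data_json.replace('\\"', '"')

doc = json.loads(unescaped)      # now valid JSON
print(doc["tag"])                # streaming_stopped
```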

Actually, I changed the log by making sure the quotes were removed and now everything is working perfectly fine.

I want to deeply thank everyone who spent time helping me :slight_smile:

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.