Having issues with a specific "message" field while parsing using Filebeat and Logstash

I am trying to push logs in JSON format to Logstash through Filebeat. My logs are decoded/parsed correctly except for a single field named "message". When I rename the field, it is parsed correctly; the issue only occurs when the field name is "message".

Below is my log format:

`{"type":"cloud_monitor","format":"default","version":"1.0","id":"ceda60685a1fba7512e9eb4","start":"1512028789.984","cp":"532198","message":{"proto":"https","protoVer":"1.1","status":"200","cliIP":"********","reqPort":"443","reqHost":"*******","reqMethod":"POST","reqPath":"%2fsolr%2fcontent_Publish%2fupdate","reqQuery":"wt%3djavabin%26version%3d2","reqCT":"application%2fxml%3b%20charset%3dUTF-8","sslVer":"TLSv1.2","respCT":"application/octet-stream","respLen":"44","bytes":"44","UA":"Solr%5borg.apache.solr.client.solrj.impl.HttpSolrServer%5d%201.0","fwdHost":"********"},"reqHdr":{"conn":"Keep-Alive",},"netPerf":{"downloadTime":"31","lastMileRTT":"4","midMileRTT":"8","midMileLatency":"6","netOriginLatency":"17","cacheStatus":"0","firstByte":"1","lastByte":"1","asnum":"14618","edgeIP":"*******"},"geo":{"country":"US","region":"VA","city":"ASHBURN","lat":"39.0438","long":"-77.4879"}}` 

The logs in Logstash look like this:

[screenshot]

My filebeat config is as follows:

- type: log

  # Change to true to enable this prospector configuration.
  enabled: true

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
  - /data2/testlogs/*.log
  tags: ["json"]
  json.keys_under_root: true
  json.add_error_key: true

And my Logstash config is as follows:

input {
  beats {
    port => 5044
    codec => "json"
  }
}

filter {
  json {
    source => "source_input"
  }
  mutate {
    remove_field => [ "remote_user", "[reqHdr][cookie]" ]
  }
}

output {
  elasticsearch {
    hosts => "localhost:9200"
  }
}

Kindly help me understand why I am having issues parsing the JSON for only the one field "message". If I rename the field in the source file, the parsing works correctly without issues. I also tried renaming the field within Logstash, but that was of no help.

Kindly help in identifying the issue.

@Deepak_Poola,

Could you post some information?

  • What version of Logstash?
  • What version of Filebeat?

What happens when you use the "message" field?

In your filter block, the json plugin uses "source_input", but I don't see this field in your log example. Where did you define this field?

In general, we use the message field as the json filter's input, like [source => "message"]. See the documentation: Source - Logstash Filter.
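For example, a minimal filter along those lines (a sketch, assuming the raw JSON line arrives in the message field) would be:

filter {
  json {
    # parse the JSON string held in "message" into structured fields
    source => "message"
  }
}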

Another point about your JSON log: there is invalid formatting in this section:
"reqHdr":{
"conn":"Keep-Alive",

}

There is a trailing comma after the single value in reqHdr. I don't know whether that could cause a problem when this JSON message is validated.

@camarar, thanks for your response, please find my comments below:

The version of Logstash is 6.2.2.
The version of Filebeat is 6.2.2.

I have pasted an image below showing how the message field is reflected in Elasticsearch.

I have tried [source => "message"] and others, but the output is the same.

Regarding the last point, kindly ignore "reqHdr"; there was a huge cookie field that I removed for readability in the post. The JSON format is fine; I have checked that.

Could you please paste the JSON tab data?

@Suman_Reddy1

Please find the JSON tab data below:

[screenshot]

I am on a mobile device. From what I understand of the message tab, the message does not look like proper JSON.
There are extra '=>' symbols, which suggests that the message was already parsed and converted into a hash. Please do one thing: post your output JSON without applying any of the Logstash filters, just a simple input and output to the console. Hope this makes sense.
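For instance, a bare-bones pipeline for that test (a sketch, assuming the same Beats port as above) might look like:

input {
  beats {
    port => 5044
  }
}

# no filter block: pass events through untouched

output {
  stdout {
    # rubydebug prints the full event structure to the console
    codec => rubydebug
  }
}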

@Suman_Reddy1

the "message" field in the raw log file is in proper format, because if I replace message with another name in the source file, logstash is parsing correctly. There is an issue with the word "message", I do not know why.

The extra '=>' is being added after parsing/processing; it is not in the source file.

It's a silly question to ask, but can I have the exact log statement? I'll try to replicate the issue.

@Suman_Reddy1

I have given the log format in the first post. Please find it pasted there.

@Deepak_Poola as pointed out previously, the log message in the first post is not valid JSON, which makes it hard to reproduce your scenario.


The => that is showing up in your Elasticsearch is perhaps a Ruby-ism, and while it isn't a desired form of output, its presence indicates a couple of things to me:

  • Logstash at some point understood the structure of what it was given, and created an in-memory structure mapping keys to values as intended
  • At some point, this structured output was flattened.

At present, I am not entirely sure what is "flattening" your message, but there are some additional things to consider:


The top-level message field can be considered semi-reserved because of some interesting properties of Elasticsearch (namely: fields cannot have mixed types) and Logstash (namely: in codec parsing-failure scenarios message will be output as a string).

Hopefully the following will provide some context:

Elasticsearch will attempt to coerce documents that it receives into the target index's field mappings, and will ultimately reject the indexing request if the document cannot be coerced. When it encounters a new field that it hasn't seen before, it will make a best guess at the field's mapping and apply it to the whole index. If a field oscillates between types, one of those types will win, and receipt of the other will cause either flat-out rejections or potentially lossy coercion of the underlying data.
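For illustration (two hypothetical one-line documents, not taken from the logs above): if Elasticsearch indexes the first document below, message is mapped as a string; when the second arrives with message as an object, it can no longer fit that mapping:

{"message":"a plain text line"}
{"message":{"proto":"https","status":"200"}}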

In many cases, when a Logstash codec fails to parse an input, it will put the input body into the message field as a string, and add an appropriate tag to the event so that downstream filters and outputs can perform logic based on the success/failure.
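As a sketch of how that tag can be used downstream (the _jsonparsefailure tag is the json plugin's default; the file path here is hypothetical):

output {
  if "_jsonparsefailure" in [tags] {
    # parsing failed: route the raw event somewhere inspectable
    file {
      path => "/tmp/failed-events.log"
    }
  } else {
    elasticsearch {
      hosts => "localhost:9200"
    }
  }
}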

Hi @yaauie,

Please find the log format below:

{"type":"cloud_monitor","format":"default","version":"1.0","id":"ceda60685a1fba7512e9eb4","start":"1512028789.984","cp":"532198","message":{"proto":"https","protoVer":"1.1","status":"200","cliIP":"*******","reqPort":"443","reqHost":"*******","reqMethod":"POST","reqPath":"%2fsolr%2fcontent_Publish%2fupdate","reqQuery":"wt%3djavabin%26version%3d2","reqCT":"application%2fxml%3b%20charset%3dUTF-8","sslVer":"TLSv1.2","respCT":"application/octet-stream","respLen":"44","bytes":"44","UA":"Solr%5borg.apache.solr.client.solrj.impl.HttpSolrServer%5d%201.0","fwdHost":"************"},"reqHdr":{"conn":"Keep-Alive"},"respHdr":{"conn":"keep-alive","date":"Thu,%2030%20Nov%202017%2007:59:50%20GMT"},"netPerf":{"downloadTime":"31","lastMileRTT":"4","midMileRTT":"8","midMileLatency":"6","netOriginLatency":"17","cacheStatus":"0","firstByte":"1","lastByte":"1","asnum":"14618","edgeIP":"********"},"geo":{"country":"US","region":"VA","city":"ASHBURN","lat":"39.0438","long":"-77.4879"}}

I understand your comments. The strange thing is that when I change the "message" name to anything else, Logstash is able to parse successfully. The issue occurs only when the field is named "message".

Is there a way to resolve this?

I am trying to understand the issue.
I have used your log message and applied a json filter to it:

json {
  source => "[message]"
  target => "docs"
}

If you want to parse message and assign it back to message again:

json {
  source => "[message]"
  target => "[message]"
}

Here message is the incoming log, which is parsed as below.

"docs" => {
        "version" => "1.0",
         "format" => "default",
         "reqHdr" => {
            "conn" => "Keep-Alive"
        },
          "start" => "1512028789.984",
        "respHdr" => {
            "conn" => "keep-alive",
            "date" => "Thu,%2030%20Nov%202017%2007:59:50%20GMT"
        },
             "id" => "ceda60685a1fba7512e9eb4",
        "message" => {
             "reqQuery" => "wt%3djavabin%26version%3d2",
                   "UA" => "Solr%5borg.apache.solr.client.solrj.impl.HttpSolrServer%5d%201.0",
                "cliIP" => "*******",
               "sslVer" => "TLSv1.2",
              "respLen" => "44",
              "fwdHost" => "************",
                "reqCT" => "application%2fxml%3b%20charset%3dUTF-8",
              "reqPort" => "443",
              "reqHost" => "*******",
            "reqMethod" => "POST",
                "proto" => "https",
             "protoVer" => "1.1",
              "reqPath" => "%2fsolr%2fcontent_Publish%2fupdate",
               "status" => "200",
                "bytes" => "44",
               "respCT" => "application/octet-stream"
        },
        "netPerf" => {
                 "cacheStatus" => "0",
                    "lastByte" => "1",
            "netOriginLatency" => "17",
                      "edgeIP" => "********",
                "downloadTime" => "31",
                  "midMileRTT" => "8",
              "midMileLatency" => "6",
                 "lastMileRTT" => "4",
                       "asnum" => "14618",
                   "firstByte" => "1"
        },
             "cp" => "532198",
           "type" => "cloud_monitor",
            "geo" => {
               "city" => "ASHBURN",
            "country" => "US",
                "lat" => "39.0438",
             "region" => "VA",
               "long" => "-77.4879"
        }
    }

Input message

{"type":"cloud_monitor","format":"default","version":"1.0","id":"ceda60685a1fba7512e9eb4","start":"1512028789.984","cp":"532198","message":{"proto":"https","protoVer":"1.1","status":"200","cliIP":"*******","reqPort":"443","reqHost":"*******","reqMethod":"POST","reqPath":"%2fsolr%2fcontent_Publish%2fupdate","reqQuery":"wt%3djavabin%26version%3d2","reqCT":"application%2fxml%3b%20charset%3dUTF-8","sslVer":"TLSv1.2","respCT":"application/octet-stream","respLen":"44","bytes":"44","UA":"Solr%5borg.apache.solr.client.solrj.impl.HttpSolrServer%5d%201.0","fwdHost":"************"},"reqHdr":{"conn":"Keep-Alive"},"respHdr":{"conn":"keep-alive","date":"Thu,%2030%20Nov%202017%2007:59:50%20GMT"},"netPerf":{"downloadTime":"31","lastMileRTT":"4","midMileRTT":"8","midMileLatency":"6","netOriginLatency":"17","cacheStatus":"0","firstByte":"1","lastByte":"1","asnum":"14618","edgeIP":"********"},"geo":{"country":"US","region":"VA","city":"ASHBURN","lat":"39.0438","long":"-77.4879"}}

Let us know if you need any specific scenario!

@Suman_Reddy1

I tried adding the target both ways as you suggested; still it is not parsing the "message" field.

Can you please paste the Filebeat and Logstash configurations?

Thanks

If it is not parsing, what error are you getting? I have only Logstash; there is no Filebeat in my case. That should not be an issue.

@Suman_Reddy1

I feel Filebeat could also be an issue. I am not getting an error, but my output has not changed; it comes out as before, as shown in the above posts.

Looking at your previous screenshots, I can say that your message is getting parsed (that's why we are seeing these '=>' symbols) and converted into a hash. If your input log message itself is JSON, then don't try to parse it again; put it straight into Elasticsearch or Kibana. Give that a try.

Please see my above comments again, specifically:

  • the => is proof that at some point Logstash parsed the message, and then something flattened it (not sure if it was Logstash or Elasticsearch).
  • the message field is semi-reserved and may be getting coerced to a string on your behalf because the Elasticsearch index expects a string at that address (perhaps because it has previously seen a string at that address).

@yaauie

Thanks for your analysis. How do I tell Logstash not to expect a string, and force it to parse the field as JSON?

In software, "reserved" means that a name either must not be used, or can only be used in very specific ways. My messages above about message being semi-reserved indicate that in this context due to the explicit ways in which Logstash and Elasticsearch work together, you cannot reliably use the field "message" as anything other than a string.

You can rename the field within Logstash after ensuring the event was correctly parsed, so that your logs can be emitted unchanged at the source; by the time an event is exported to Elasticsearch, the data is no longer in the semi-reserved message field, but in some other field (e.g., msg):

filter {
  if "_jsonparsefailure" not in [tags] {
    mutate {
      # mutate's rename option takes a hash of old-name => new-name pairs
      rename => { "message" => "msg" }
    }
  }
}
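With the rename in place, a quick way to confirm the shape of events before they reach Elasticsearch is a temporary rubydebug output alongside the existing one (a sketch; drop it once verified):

output {
  # temporary console output for inspection
  stdout { codec => rubydebug }
  elasticsearch { hosts => "localhost:9200" }
}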
