Filter individual JSON fields using the json filter

I'm trying to parse my message into JSON fields using the json filter, but I'm running into issues.

Error:
Failed to execute action {:action=>LogStash::PipelineAction::Create/pipeline_id:main, :exception=>"LogStash::ConfigurationError", :message=>"Expected one of [ \\t\\r\\n], \"#\", \"=>\" at line 17, column 14 (byte 387) after filter {\n if [type] == \"app-data\" {\n mutate {\n rename => [\"env\", \"Environment\" ]\n }\n filter {\n json ", :backtrace=>["/Users/metrics-poc/logstash-7.10.0/logstash-core/lib/logstash/compiler.rb:32:in compile_imperative'", "org/logstash/execution/AbstractPipelineExt.java:184:in initialize'", "org/logstash/execution/JavaBasePipelineExt.java:69:in initialize'", "/Users/metrics-poc/logstash-7.10.0/logstash-core/lib/logstash/java_pipeline.rb:47:in initialize'", "/Users/metrics-poc/logstash-7.10.0/logstash-core/lib/logstash/pipeline_action/create.rb:52:in execute'", "/Users/metrics-poc/logstash-7.10.0/logstash-core/lib/logstash/agent.rb:365:in block in converge_state'"]}

Input for Logstash:
timestamp:2020-11-30T02:41:41.244Z,message:{"ID":"1394","Type":"com.5","STAGE":"preview-0","ACCOUNT_NUMBER":"12345","REGION":"US-EAST-8"}

Logstash config file

input {
  file {
        path => "/Users/metrics-poc/filebeat-output.log"
        add_field => {"env" => "prod"}
        add_field => {"Hostname" => "ELB-1"}
        start_position => "beginning"
        type => "app-data"
  }
}

filter {
  if [type] == "app-data" {
      mutate {
        rename => ["env", "Environment" ]
      }
      filter {
        json {
              source => "message"
              target => "jsoncontent" # with multiple layers structure
        }
      }
  }
}

output {
stdout { codec => rubydebug }
}

I've tried the csv and grok filters but ran into failures; I thought the json filter is the ideal one for parsing JSON data. Here's a sample of the output using the csv filter:

{
       "Hostname" => "ELB-1",
    "Environment" => "prod",
           "path" => "/Users/a664302/fidelity_projects/metrics-poc/filebeat-output.log",
           "tags" => [
        [0] "_csvparsefailure"
    ],
       "@version" => "1",
        "message" => "timestamp:2020-11-30T02:41:41.244Z,message:{\"ID\":\"1394\",\"Type\":\"com.5\",\"STAGE\":\"preview-0\",\"ACCOUNT_NUMBER\":\"12345\",\"REGION\":\"US-EAST-1\"}",
     "@timestamp" => 2020-11-30T07:41:42.484Z,
           "host" => "MACLB1781",
           "type" => "app-data"
}

Ideally, what I would like is to split the message into individual fields: the timestamp as a separate field, and every JSON key-value pair in "message" as a separate field. That's all I'm trying to do.

You have two filter blocks, one nested inside the other; you only need one to start with. You might also want to move the json part out of the conditional statement, depending on your use case.

filter {
  if [type] == "app-data" {
    mutate {
      rename => ["env", "Environment" ]
    }
    json {
      source => "message"
      target => "jsoncontent" # with multiple layers structure
    }
  }
}

I've tried

filter {
  if [type] == "app-data" {
      mutate {
        rename => ["env", "Environment" ]
      }
      json {
        source => "message"
        target => "jsoncontent" # with multiple layers structure
      }
  }
}

but I still see JSON parse failures:

{
       "@version" => "1",
     "@timestamp" => 2020-11-30T14:43:32.475Z,
           "host" => "MACL781",
           "type" => "app-data",
       "Hostname" => "ELB-1",
    "Environment" => "prod",
           "tags" => [
        [0] "_jsonparsefailure"
    ],
        "message" => "timestamp:2020-11-30T02:41:41.244Z,message:{\"ID\":\"1394\",\"Type\":\"com.5\",\"STAGE\":\"preview-0\",\"ACCOUNT_NUMBER\":\"12345\",\"REGION\":\"US-EAST-1\"}",
        "path" => "/Users/metrics-poc/filebeat-output.log"
}

Try adding codec => "json" to your input?

Yup, tried that and was running into the same _jsonparsefailure.

Your [message] field is not JSON. You will need to parse out the timestamp and message fields within [message] before trying to use a json filter on the nested message field.
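
As an illustration of that approach (a sketch, not from the thread): split the raw line first, then run the json filter on just the JSON part. The dissect filter is one way to do the split; the field names logTimestamp and jsonPart below are arbitrary, and the pattern assumes every line has the exact layout timestamp:&lt;ISO8601&gt;,message:{...}.

filter {
  # Sketch: split the raw line into a timestamp part and a JSON part,
  # assuming every event looks like "timestamp:<ISO8601>,message:{...}".
  dissect {
    mapping => { "message" => "timestamp:%{logTimestamp},message:%{jsonPart}" }
  }
  # Parse only the JSON part into a nested field.
  json {
    source => "jsonPart"
    target => "jsoncontent"
  }
}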


I've tried this, but I'm not sure how to split the timestamp and the message:

filter {
  if [type] == "app-data" {
    mutate {
      rename => ["env", "Environment"]
    }
    mutate {
      add_field => {
        "[@metadata][copyOfMessage]" => "%{[message]}"
      }
    }
    grok {
      break_on_match => false
      match => {
        "message" => "%{DATA:timestamp}\,%{DATA:customMessage}"
      }
    }
    json {
      source => "customMessage"
      target => "jsoncontent"
      # with multiple layers structure
    }
  }
}

which I thought should work, but this is how the output looks:

{
       "@version" => "1",
     "@timestamp" => 2020-11-30T14:43:32.475Z,
           "host" => "MACL781",
           "type" => "app-data",
       "Hostname" => "ELB-1",
    "Environment" => "prod",
           "tags" => [
        [0] "_jsonparsefailure"
    ],
      "timestamp" => "timestamp:2020-11-30T02:41:41.244Z",
        "message" => "timestamp:2020-11-30T02:41:41.244Z,message:{\"ID\":\"1394\",\"Type\":\"com.5\",\"STAGE\":\"preview-0\",\"ACCOUNT_NUMBER\":\"12345\",\"REGION\":\"US-EAST-1\"}",
           "path" => "/Users/metrics-poc/filebeat-output.log"
}

Try

"message" => "^timestamp:%{DATA:timestamp}\,message:%{DATA:customMessage}"

Yup, just tried this; the timestamp value was updated, but the message is still not being parsed:

{
       "Hostname" => "ELB-1",
           "tags" => [
        [0] "_jsonparsefailure"
    ],
     "@timestamp" => 2020-11-30T17:32:35.748Z,
       "@version" => "1",
           "type" => "app-data",
    "Environment" => "prod",
      "timestamp" => "2020-11-30T02:41:41.244Z",
        "message" => "timestamp:2020-11-30T02:41:41.244Z,message:{\"ID\":\"1394\",\"Type\":\"com.5\",\"STAGE\":\"preview-0\",\"ACCOUNT_NUMBER\":\"12345\",\"REGION\":\"US-EAST-1\"}",
           "path" => "/Users/metrics-poc/filebeat-output.log",
           "host" => "MA781"
}

The customMessage field is not present in that event, but it must have been parsed because if the source field does not exist then the json filter is a no-op (you would not get a _jsonparsefailure tag).

Where did the customMessage field go, and what error message does the json filter log?
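
(An aside, not from the thread: one way to make the no-op behavior visible is to run the json filter only when the field actually exists, and tag the event otherwise. The tag name below is purely hypothetical, for debugging.)

filter {
  if [customMessage] {
    json {
      source => "customMessage"
      target => "jsoncontent"
    }
  } else {
    # Hypothetical debug tag: shows that grok never produced customMessage.
    mutate { add_tag => ["customMessage_missing"] }
  }
}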

Sorry, it slipped my mind to provide this error from the Logstash stdout:

[ 2020-11-30T12:32:35,738][ERROR][logstash.codecs.json ][main][45464e9cae547c42c85d6a1565808cb8d2122ed8dc8545f74bc92e28470678e0] JSON parse error, original data now in message field {:error=>#<LogStash::Json::ParserError: Unrecognized token 'timestamp': was expecting 'null', 'true', 'false' or NaN
at [Source: (String)"timestamp:2020-11-30T12:32:34.569Z,message:{"ID":"1394","Type":"com.5","STAGE":"preview-0","ACCOUNT_NUMBER":"12345","REGION":"US-EAST-9"}"; line: 1, column: 10]>, :data=>"timestamp:2020-11-30T12:32:34.569Z,message:{"ID":"1394","Type":"com.5","STAGE":"preview-0","ACCOUNT_NUMBER":"12345","REGION":"US-EAST-9"}"}

That looks like you are setting the source of the json filter to be [message] rather than [customMessage].

This is the filter I am using, where the source is specified as customMessage:

filter {
  if [type] == "app-data" {
    mutate {
      rename => ["env", "Environment"]
    }
    mutate {
      add_field => {
        "[@metadata][copyOfMessage]" => "%{[message]}"
      }
    }
    grok {
      break_on_match => false
      match => {
      "message" => "^timestamp:%{DATA:timestamp}\,message:%{DATA:customMessage}"
      }
    }
    json {
      source => "customMessage"
      target => "jsoncontent"
      # with multiple layers structure
    }
  }
}

That error is from a codec, not a json filter.
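
To illustrate the distinction (a sketch, not the poster's actual config): a json codec is attached to the input and tries to decode every incoming line as JSON before any filter runs, whereas a json filter runs later in the pipeline and can be pointed at a single field.

# A codec decodes data as it enters the pipeline:
input {
  file {
    path  => "/Users/metrics-poc/filebeat-output.log"
    codec => "json"   # fails here if the whole line is not valid JSON
  }
}

# A filter runs afterwards and can target one field:
filter {
  json {
    source => "customMessage"
    target => "jsoncontent"
  }
}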

Gotcha... I see what the error is.

I had to remove codec => json from the input:

input {
  file {
    path => "/Users/a664302/fidelity_projects/metrics-poc/filebeat-output.log"
    add_field => {
      "env" => "prod"
    }
    add_field => {
      "Hostname" => "ELB-1"
    }
    start_position => "beginning"
    type => "app-data"
  }
}

and this doesn't give me the JSON parse error, which is nice, but it also doesn't parse the message:

{
       "Hostname" => "ELB-1",
      "timestamp" => "2020-11-30T14:24:04.601Z",
       "@version" => "1",
     "@timestamp" => 2020-11-30T19:24:05.772Z,
           "host" => "MA781",
        "message" => "timestamp:2020-11-30T14:24:04.601Z,message:{\"ID\":\"1394\",\"Type\":\"com.5\",\"STAGE\":\"preview-0\",\"ACCOUNT_NUMBER\":\"12345\",\"REGION\":\"US-EAST-1\"}",
    "Environment" => "prod",
           "path" => "/Users/metrics-poc/filebeat-output.log",
           "type" => "app-data"
}

Notice that there is now no _jsonparsefailure tag, and also no customMessage field. I do not think your grok is working. Does removing the \ preceding the comma help?

Yep, there's no parse error. Removing the \ preceding the comma didn't alter the output much.
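
For what it's worth, a likely reason the customMessage capture comes out empty (this is not confirmed anywhere in the thread): DATA is the non-greedy pattern .*?, so at the end of an unanchored expression it can match nothing at all. A minimal sketch that anchors the pattern and uses GREEDYDATA for the trailing JSON part instead:

filter {
  grok {
    # Anchor the pattern and let GREEDYDATA run to the end of the line,
    # so customMessage captures the full JSON object.
    match => {
      "message" => "^timestamp:%{TIMESTAMP_ISO8601:timestamp},message:%{GREEDYDATA:customMessage}$"
    }
  }
  json {
    source => "customMessage"
    target => "jsoncontent"
  }
}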
