Multiple Nested JSON Formats parsing with Logstash

Hello everyone,

I am trying to ingest multiple logs into my logstash. The input type is a JSON (Nested Jsons), but the formats are varying. For example,

JSON 1)   1|2|3|4|{"event-ts":"07-07-2022 17:42:55.294","event-name":"nft_custom1","eventsubtype":"custom1","eventtype":"nft","source":"dc","msgBody":{"intl_pos_chnl_use_flg_old":"","intl_ecomm_chnl_use_flg_new":"","intl_tokenisation_chnl_use_flg_new":"","dom_tokenisation_chnl_lmt_old":"","intl_atm_chnl_use_flg_old":""}}

JSON 2)   1|2|3|4|{"msgName":"EVENT","msgSource":"CHNL","EventType":"FT","EventSubType":"Custom2","HostUserId":"1234","EntityType":"ACCT","EntityId":"4445293133","AddEntityType_1":"CUST","AddEntityType_2":"BRCH","AddEntityId_2":"NA","AddEntityType_3":"EPTY","AddEntityId_3":"NA","ChannelID":"0659","ChannelType":"BRANCH","ChannelDesc":"","EventTime":"05-07-2022 00:00:00","msgBody":"{'HostId':'F','acid':'123456789','accountId':'1234546789','schm_type':'CUSTOM','schm_code':'CUSTOM','place_holder':'','cust_id':'741852963','acct_name':'TESTME','acct_sol_id':'1789','acct_ownership':'C'}","_av_":"C/48.D14P1"}

The jsons itself is nested one, the nested JSON starts from the msgBody. In some jsons, msgBody is coming as a String Type (Refer JSON 2, msgBody key there are double quotes before the "{") while others are already JSON types (Refer JSON 1, msgBody key there are no double quotes before the "{").

While parsing the files containing both the formats it sometimes parsing the String datatypes and sometime parsing the JSON datatypes.

Below is the output of parsing the file, getting exception while parsing the "String Type" nested JSON while other one got parsed successfully in kibana.

[2022-09-01T12:13:59,000][WARN ][logstash.outputs.elasticsearch][main][2f7ed5f6346399cddcba3a74d84f3ab272f660c576519aae243daf20e1bd9b01] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"elk-logstash-cxmsg", :routing=>nil}, {"log"=>{"file"=>{"path"=>"/home/clari5/test.cxmsg"}}, "message"=>"1|2|3|4|{\"msgName\":\"EVENT\",\"msgSource\":\"CHNL\",\"EventType\":\"FT\",\"EventSubType\":\"Custom2\",\"HostUserId\":\"1234\",\"EntityType\":\"ACCT\",\"EntityId\":\"4445293133\",\"AddEntityType_1\":\"CUST\",\"AddEntityType_2\":\"BRCH\",\"AddEntityId_2\":\"NA\",\"AddEntityType_3\":\"EPTY\",\"AddEntityId_3\":\"NA\",\"ChannelID\":\"0659\",\"ChannelType\":\"BRANCH\",\"ChannelDesc\":\"\",\"EventTime\":\"05-07-2022 00:00:00\",\"msgBody\":\"{'HostId':'F','acid':'123456789','accountId':'1234546789','schm_type':'CUSTOM','schm_code':'CUSTOM','place_holder':'','cust_id':'741852963','acct_name':'TESTME','acct_sol_id':'1789','acct_ownership':'C'}\",\"_av_\":\"C/48.D14P1\"}", "host"=>{"name"=>"elastictool.customerxps.com"}, "@version"=>"1", "nestedJson"=>{"acid"=>"123456789", "schm_code"=>"CUSTOM", "acct_name"=>"TESTME", "acct_sol_id"=>"1789", "place_holder"=>"", "HostId"=>"F", "accountId"=>"1234546789", "cust_id"=>"741852963", "acct_ownership"=>"C", "schm_type"=>"CUSTOM"}, "type"=>"cxmsg", "@timestamp"=>2022-09-01T06:43:58.752850Z, "event"=>{"original"=>"1|2|3|4|{\"msgName\":\"EVENT\",\"msgSource\":\"CHNL\",\"EventType\":\"FT\",\"EventSubType\":\"Custom2\",\"HostUserId\":\"1234\",\"EntityType\":\"ACCT\",\"EntityId\":\"4445293133\",\"AddEntityType_1\":\"CUST\",\"AddEntityType_2\":\"BRCH\",\"AddEntityId_2\":\"NA\",\"AddEntityType_3\":\"EPTY\",\"AddEntityId_3\":\"NA\",\"ChannelID\":\"0659\",\"ChannelType\":\"BRANCH\",\"ChannelDesc\":\"\",\"EventTime\":\"05-07-2022 00:00:00\",\"msgBody\":\"{'HostId':'F','acid':'123456789','accountId':'1234546789','schm_type':'CUSTOM','schm_code':'CUSTOM','place_holder':'','cust_id':'741852963','acct_name':'TESTME','acct_sol_id':'1789','acct_ownership':'C'}\",\"_av_\":\"C/48.D14P1\"}"}, "eventJson"=>{"EventSubType"=>"Custom2", "EventTime"=>"05-07-2022 00:00:00", "AddEntityType_3"=>"EPTY", "msgBody"=>"{'HostId':'F','acid':'123456789','accountId':'1234546789','schm_type':'CUSTOM','schm_code':'CUSTOM','place_holder':'','cust_id':'741852963','acct_name':'TESTME','acct_sol_id':'1789','acct_ownership':'C'}", "msgSource"=>"CHNL", "EventType"=>"FT", "_av_"=>"C/48.D14P1", "HostUserId"=>"1234", "EntityType"=>"ACCT", "AddEntityType_1"=>"CUST", "AddEntityId_2"=>"NA", "ChannelID"=>"0659", "EntityId"=>"4445293133", "AddEntityId_3"=>"NA", "ChannelDesc"=>"", "ChannelType"=>"BRANCH", "msgName"=>"EVENT", "AddEntityType_2"=>"BRCH"}}], :response=>{"index"=>{"_index"=>"elk-logstash-cxmsg", "_id"=>"AFLK94IBBgrpeOI5y6Zw", "status"=>400, "error"=>{"type"=>"illegal_argument_exception", "reason"=>"can't merge a non object mapping [eventJson.msgBody] with an object mapping"}}}}

Mentioning below the filter I am applying for my files:

filter{
 if [type] == "cxmsg"{
  grok{
     match => {
       "message" => "\|(?:[^|]+\|)+%{GREEDYDATA:recvd_json}"
     }
  }

  json{
        source => "recvd_json"
        target => ["eventJson"]
        remove_field=>["recvd_json"]
  }

  mutate {
     add_field => { "subJson" => "%{[eventJson][msgBody]}" }
  }

  mutate{
     gsub => [
       'subJson', '\'','"'
      ]
  }

  json{
    source => "subJson"
    target => "[nestedJson]"
    remove_field=>["subJson"]
  }
 }
}

Kindly suggest what can be done to parse both the formats.

Thanks in advance.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.