Still facing an issue with Json multiline

Blason · September 30, 2023, 9:16am

Hi Team,

I am still having difficulty parsing multiline JSON and cannot figure out what is wrong. Can someone please help?
Here is the original message:

[
    {
        "post_title": "Windemuller",
        "group_name": "lorenz",
        "discovered": "2020-01-12 00:00:00.000000"
    },
    {
        "post_title": "Leaks Company Birch Communications inc.",
        "group_name": "ragnarlocker",
        "discovered": "2020-06-10 00:00:00.000000"
    },
    {
        "post_title": "Brunner Announce – Hello World !",
        "group_name": "ragnarlocker",
        "discovered": "2020-06-11 00:00:00.000000"
    },
{
        "post_title": "INC RANSOMWARE...",
        "group_name": "donutleaks",
        "discovered": "2023-09-30 04:27:49.408003"
    }
]

And here is my pipeline config:

input {
#       file {
#       path => ["/var/log/ran.json"]
#       tags => "ransomware"
#       start_position => "beginning"
        stdin {
        codec => json { target => "[document]" }
#       codec => multiline {
#               pattern => "^{"
#               what => "previous"
#               }
        }
        }
#filter {
#       if [message] =~ /^{.*}$/ {
#       json {
#               source => "message"
#               target => "parsed_json"
##              remove_field => ["message"]
#                       }
#                      }
#}
output {
        stdout { codec => rubydebug }
}

Rios · September 30, 2023, 6:15pm

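input {
  file {
    path => [ "/path/file.json" ]
    start_position => beginning
    sincedb_path => "/dev/null"
    # Start a new event at every line that begins with "{";
    # all other lines are appended to the previous event.
    codec => multiline {
      pattern => '^\s*{'
      negate => true
      what => previous
      auto_flush_interval => 1
      multiline_tag => ""
    }
  }
}
filter {
  # Collapse the CRLF line breaks and indentation so that each event
  # holds one compact JSON object without the trailing comma.
  mutate {
    gsub => [ 'message',"\s*{\r\n\s*",'{']
    gsub => [ 'message',",\r\n\s*",',']
    gsub => [ 'message',"\r\n\s*},\r",'}']
    gsub => [ 'message',"\r\n\s*}\s*]\r",'}']
  }

  # Drop the leftover array bracket lines; parse everything else as JSON.
  if [message] =~ /^\[|\]/ {
    drop {}
  }
  else {
    json { source => "message" }
  }
  date {
    match => ["discovered", "yyyy-MM-dd HH:mm:ss.SSSSSS"]
    #timezone => "Asia/Dubai"
    target => "discovered"
  }
}
output {
  stdout { }
}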

Result:

{
       "message" => "{\"post_title\": \"Windemuller\",\"group_name\": \"lorenz\",\"discovered\": \"2020-01-12 00:00:00.000000\"}",
    "post_title" => "Windemuller",
    "@timestamp" => 2023-09-30T18:11:06.538084300Z,
      "@version" => "1",
    "group_name" => "lorenz",
    "discovered" => 2020-01-11T23:00:00.000Z
}
{
       "message" => "{\"post_title\": \"Leaks Company Birch Communications inc.\",\"group_name\": \"ragnarlocker\",\"discovered\": \"2020-06-10 00:00:00.000000\"}",
    "post_title" => "Leaks Company Birch Communications inc.",
    "@timestamp" => 2023-09-30T18:11:06.539095700Z,
      "@version" => "1",
    "group_name" => "ragnarlocker",
    "discovered" => 2020-06-09T22:00:00.000Z
}
{
       "message" => "{\"post_title\": \"Brunner Announce ? Hello World !\",\"group_name\": \"ragnarlocker\",\"discovered\": \"2020-06-11 00:00:00.000000\"}",
    "post_title" => "Brunner Announce ? Hello World !",
    "@timestamp" => 2023-09-30T18:11:06.540077Z,
      "@version" => "1",
    "group_name" => "ragnarlocker",
    "discovered" => 2020-06-10T22:00:00.000Z
}
{
       "message" => "{\"post_title\": \"INC RANSOMWARE...\",\"group_name\": \"donutleaks\",\"discovered\": \"2023-09-30 04:27:49.408003\"}",
    "post_title" => "INC RANSOMWARE...",
    "@timestamp" => 2023-09-30T18:11:08.028026200Z,
      "@version" => "1",
    "group_name" => "donutleaks",
    "discovered" => 2023-09-30T02:27:49.408Z
}

Blason · October 1, 2023, 7:04am

Hmm, some events are getting parsed while others are not:

{
          "host" => "parsers",
      "@version" => "1",
    "@timestamp" => 2023-10-01T07:03:11.509Z,
       "message" => "    {\n        \"post_title\": \"palaciodosleiloes.com.br\",\n        \"group_name\": \"lockbit3\",\n        \"discovered\": \"2023-09-29 22:31:29.689195\"\n    },",
          "path" => "/var/log/ran.json",
          "tags" => [
        [0] "_jsonparsefailure"
    ]
}
{
          "host" => "parsers",
      "@version" => "1",
    "@timestamp" => 2023-10-01T07:03:11.509Z,
       "message" => "    {\n        \"post_title\": \"mclaren health care\",\n        \"group_name\": \"alphv\",\n        \"discovered\": \"2023-09-29 23:29:04.495330\"\n    },",
          "path" => "/var/log/ran.json",
          "tags" => [
        [0] "_jsonparsefailure"
    ]
}
{
          "host" => "parsers",
      "@version" => "1",
    "@timestamp" => 2023-10-01T07:03:11.509Z,
       "message" => "    {\n        \"post_title\": \"MNGI Digestive Health (TIME IS UP)\",\n        \"group_name\": \"alphv\",\n        \"discovered\": \"2023-09-30 02:29:31.640284\"\n    },",
          "path" => "/var/log/ran.json",
          "tags" => [
        [0] "_jsonparsefailure"
    ]
}
{
          "host" => "parsers",
      "@version" => "1",
    "@timestamp" => 2023-10-01T07:03:12.742Z,
    "discovered" => 2023-09-29T22:57:49.408Z,
       "message" => "    {\n        \"post_title\": \"INC RANSOMWARE...\",\n        \"group_name\": \"donutleaks\",\n        \"discovered\": \"2023-09-30 04:27:49.408003\"\n    }",
          "path" => "/var/log/ran.json",
    "post_title" => "INC RANSOMWARE...",
    "group_name" => "donutleaks"
}

Rios · October 1, 2023, 7:45am

Well, the sample you posted was parsed correctly. For the other cases, extend the gsub rules a little further.

Badger · October 1, 2023, 7:37pm

Your "JSON" has a trailing comma. As Rios says, you can update your mutate to fix that.
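
A minimal sketch of such a fix, assuming the failing events look like the ones posted above: the line breaks in `message` are `\n` rather than `\r\n` (so the `\r\n` patterns never match), and the only thing that breaks the parse is the comma after the closing brace. The `\z` anchor is used instead of `$` so that only the very end of the multiline message is touched and the commas between fields are left alone. The pattern is an illustration and has not been run against the full file:

```
filter {
  mutate {
    # Strip a trailing comma, plus any whitespace after it,
    # from the very end of the event. Place this before the json filter.
    gsub => [ "message", ",\s*\z", "" ]
  }
  json { source => "message" }
}
```

With this in place, a message ending in `},` should end in `}` instead, which is the same shape as the "INC RANSOMWARE..." event that already parsed. The newlines and indentation left inside the object are ordinary JSON whitespace, so they do not need to be removed.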
