Extract fields from JSON file into Elasticsearch using Logstash filters

Hello,

I'm trying to extract fields from my JSON file, but I get a _jsonparsefailure error. I tried many other filters such as grok, split, or kv, but always with the same result: Logstash doesn't extract the values of the fields separately.

Here is my pipeline.conf:

input {
    file {
        type => "json"
        path => "/var/lib/logstash/test1.json"
        start_position => "beginning"
        sincedb_path => "/dev/null"
        # codec => json {}
    }
}

filter {
    json {
        source => "message"
        target => "event"
    }
    mutate {
        gsub => ["message", "\]", ""]
        gsub => ["message", "\[", ""]
    }
}


output {
    elasticsearch {
        hosts => "http://localhost:9200"
        index => "my_index"
        document_type => "json"
        ecs_compatibility => disabled
    }
    stdout {}
}

This is my JSON file:

[{ "start" : 1619731540, "end" : 1619731550, "scen_id" : 0, "conn_id" : 2, "test_id" : 1, "direction" : "fw", "srcip" : "0.0.0.0", "dstip" : "0.0.0.0", "state" : "Atomic Delay", "samples" : 1, "mean_latency" : 261131, "mean_jitter" : 99494, "lost_pkts" : 0, "sent_pkts" : 100, "recv_pkts" : 100}
,{ "start" : 1619731540, "end" : 1619731550, "scen_id" : 0, "conn_id" : 2, "test_id" : 1, "direction" : "sw", "srcip" : "0.0.0.0", "dstip" : "0.0.0.0", "state" : "Atomic Delay", "samples" : 1, "mean_latency" : 218259, "mean_jitter" : 21353, "lost_pkts" : 0, "sent_pkts" : 100, "recv_pkts" : 100}
]

And this is how stdout shows the result:

{
          "path" => "/var/lib/logstash/Twamp_Results/754002845-Twamp_vProbe-0-1619731550-pathanomaly.json",
    "@timestamp" => 2021-06-01T09:09:53.436Z,
          "host" => "localhost.localdomain",
          "tags" => [
        [0] "_jsonparsefailure"
    ],
       "message" => "{ \"start\" : 1619731540, \"end\" : 1619731550, \"scen_id\" : 0, \"conn_id\" : 2, \"test_id\" : 1, \"direction\" : \"fw\", \"srcip\" : \"10.1.254.83\", \"dstip\" : \"10.1.158.113\", \"state\" : \"Atomic Delay\", \"samples\" : 1, \"mean_latency\" : 261131, \"mean_jitter\" : 99494, \"lost_pkts\" : 0, \"sent_pkts\" : 100, \"recv_pkts\" : 100}",
          "type" => "json",
      "@version" => "1"
}
{
          "path" => "/var/lib/logstash/Twamp_Results/754002845-Twamp_vProbe-0-1619731550-pathanomaly.json",
    "@timestamp" => 2021-06-01T09:09:53.477Z,
          "host" => "localhost.localdomain",
          "tags" => [
        [0] "_jsonparsefailure"
    ],
       "message" => ",{ \"start\" : 1619731540, \"end\" : 1619731550, \"scen_id\" : 0, \"conn_id\" : 2, \"test_id\" : 1, \"direction\" : \"sw\", \"srcip\" : \"10.1.158.113\", \"dstip\" : \"10.1.254.83\", \"state\" : \"Atomic Delay\", \"samples\" : 1, \"mean_latency\" : 218259, \"mean_jitter\" : 21353, \"lost_pkts\" : 0, \"sent_pkts\" : 100, \"recv_pkts\" : 100}",
          "type" => "json",
      "@version" => "1"
}
{
          "path" => "/var/lib/logstash/Twamp_Results/754002845-Twamp_vProbe-0-1619731550-pathanomaly.json",
    "@timestamp" => 2021-06-01T09:09:53.481Z,
          "host" => "localhost.localdomain",
          "tags" => [
        [0] "_jsonparsefailure"
    ],
       "message" => "",
          "type" => "json",
      "@version" => "1"
}

Could someone help me with that, please?
Thank you in advance

Please edit your post, select the configuration, and click on </> in the toolbar above the edit pane. Check the preview pane on the right and make sure the format changes from

input {
file {
type => "json"

to

input {
    file {
        type => "json"
        ...

Then do the same for the filter, and the same for the output.

It's done, thank you :wink:
But do you have any idea how I can fix the problem with my extraction?

Your JSON file contains three lines, none of which is valid JSON on its own. Each of the first two lines is almost valid JSON. There are two approaches you could take. One is to fix up the lines you have using mutate before you try to parse them:

# Strip the array brackets and the leading comma that separates
# the objects, leaving one bare JSON object per line.
mutate {
    gsub => [
        "message", "\]", "",
        "message", "\[", "",
        "message", "^,", ""
    ]
}
# The line containing only "]" is empty after the gsub, so drop it.
if [message] =~ "^$" { drop {} }
json {
    source => "message"
    target => "event"
}

The other is to parse the entire file as an array of JSON objects, then use a split filter to separate them. See here for more information.
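A minimal sketch of that second approach, assuming a multiline codec whose pattern never matches, so the file input accumulates every line and flushes the whole file as a single event (the pattern string is arbitrary, it just has to never match a real line):

input {
    file {
        path => "/var/lib/logstash/test1.json"
        start_position => "beginning"
        sincedb_path => "/dev/null"
        # This pattern never matches, so each line is appended to
        # the previous one until auto_flush_interval expires and
        # the entire file is emitted as one event.
        codec => multiline {
            pattern => "^Spalanzani"
            negate => true
            what => "previous"
            auto_flush_interval => 2
        }
    }
}
filter {
    # Parse the whole array, then emit one event per array element.
    json { source => "message" target => "event" }
    split { field => "event" }
}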

Thank you very much @Badger

This works for my basic JSON file, but I have another invalid format of the file.

In the case where I have an array inside my message array, how can I deal with that?

Thank you in advance for your help.

[{ "start" : 1619724650, "end" : 1619724660, "scen_id" : 0, "conn_id" : 2, "test_id" : 1, "synchro" : 0, "sampling_period" : 10, "data" : [192000,506100,841000,358000,496500,757000,120515,-614001,100404,522000,-200000,-3001,240999,177302,-261000,97696,388000,-139000,45000,304000,128776,18000,285090,580000,134000,282000,539000,112113,-331000,89402,477000,-205999,0,229000,150250,-187000,104173,375000,-71000,77000,334000,114772,157000,223859,316000,182000,218500,288001,31902,-87001,32858,123000,-67000,-5000,70000,53484,-91000,34949,68000,-68000,-30000,40001,67398,0,0,0,0,0,0,0,0,0,0,100,100,1,0,47000,174119,610000],
"time_offset" : [0,0,100,-54523,-6833,26304,0,0,100,-102372,51502,534622]
,"ftl" : [{"val": 255,"int": 1619724650,"frac": 60069996}]
,"wtl" : [{"val": 255,"int": 1619724650,"frac": 60622996}]}
]

Consume the file as a single event, as in the sketch above. Read the post I linked to.
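For what it's worth, a minimal sketch of the filter section for that file, assuming the same whole-file input as above (field names taken from your sample):

filter {
    json { source => "message" target => "event" }
    # One event per element of the outer array; nested arrays such as
    # [event][data], [event][time_offset], [event][ftl], and
    # [event][wtl] simply remain array fields on each event, so they
    # need no extra handling.
    split { field => "event" }
}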
