Logstash Filter Json To Array of Json

(Vineethpp) #1

Hi,
I am trying to write a Logstash filter that sends each JSON object to Kafka as an array of JSON objects.
Right now, without any such filter, Logstash writes the plain JSON object, but I need each Kafka event to be an array of JSON objects so that the consumer can understand and read it.
This is what happens now:
{"http.method":"GET","http.path":"/serviceB"},"timestamp":1557911843380103,"parentId":"32f100cf25cd47ad","id":"1b61c2205ff61f8f","name":"get","localEndpoint":{"ipv4":"127.0.0.1","serviceName":"sample"},"traceId":"32f100cf25cd47ad"}

But I need it to be within an array:
[{"http.method":"GET","http.path":"/serviceB"},"timestamp":1557911843380103,"parentId":"32f100cf25cd47ad","id":"1b61c2205ff61f8f","name":"get","localEndpoint":{"ipv4":"127.0.0.1","serviceName":"sample"},"traceId":"32f100cf25cd47ad"}]

My config:

    input {
        http {
            port => 9200
            host => "0.0.0.0"
        }
    }
    filter {
        mutate {
            remove_field => [ "headers", "@timestamp", "host", "@version" ]
        }
    }
    output {
        kafka {
            codec => json {}
            bootstrap_servers => "kafka-service.test.com:9092"
            topic_id => "test"
        }
    }

Please help if anyone has an approach.

#2

I am not understanding what this represents. Can you configure

output { stdout { codec => rubydebug } }

and show us what an event looks like?

(Vineethpp) #3

We have an application that sends data over HTTP (the body is a JSON object), like the example below.
I have written a Logstash config that listens for these HTTP requests, accepts them, uses one filter to remove the fields added by the http input plugin, and then outputs the JSON object to a Kafka topic. Consumers read from that Kafka topic. The problem is that the consumers only read an array of JSON objects, not a bare JSON object. With the current Logstash config, the messages in the Kafka topic look like this (identical to what the application sends; this is trace data generated by the Zipkin instrumentation in that application):

{"http.method":"GET","http.path":"/serviceB"},"timestamp":1557911843380103,"parentId":"32f100cf25cd47ad","id":"1b61c2205ff61f8f","name":"get","localEndpoint":{"ipv4":"127.0.0.1","serviceName":"sample"},"traceId":"32f100cf25cd47ad"}

I want the message to be enclosed in square brackets in the Kafka topic so that the consumer can understand and consume it, like this:

[{"http.method":"GET","http.path":"/serviceB"},"timestamp":1557911843380103,"parentId":"32f100cf25cd47ad","id":"1b61c2205ff61f8f","name":"get","localEndpoint":{"ipv4":"127.0.0.1","serviceName":"sample"},"traceId":"32f100cf25cd47ad"}]

I am looking for a filter that adds square brackets around the entire JSON object coming out of the http input plugin before it is written to Kafka.
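
To make the difference concrete, here is a minimal Ruby sketch of a consumer-side check. The helper name `spans_from` and the shortened sample message are assumptions for illustration only; the real consumer is not shown in this thread and may behave differently.

```ruby
require "json"

# Hypothetical consumer-side check (not the real consumer): accept a Kafka
# message only if its top-level JSON value is an array.
def spans_from(message)
  parsed = JSON.parse(message)
  raise ArgumentError, "expected a JSON array of objects" unless parsed.is_a?(Array)
  parsed
end

# Shortened sample message; the field values are taken from the example above.
bare_message    = '{"id":"1b61c2205ff61f8f","name":"get","traceId":"32f100cf25cd47ad"}'
wrapped_message = "[#{bare_message}]"

# The wrapped form parses into an array holding the single object.
puts spans_from(wrapped_message).length

# The bare object is rejected by this check.
begin
  spans_from(bare_message)
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```

The only difference between the two messages is the pair of square brackets, which is what the Logstash output needs to add.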

(Vineethpp) #4

I ran output { stdout { codec => rubydebug } } and the output looks like the following. I need to add square brackets around it; that is, I need to put the JSON object into an array of JSON objects.

    {
        "timestamp" => 1557911261541631,
        "id" => "97e9096d50474d6a",
        "kind" => "SERVER",
        "tags" => {
            "mvc.controller.method" => "callService",
            "mvc.controller.class" => "TracingController",
            "http.path" => "/serviceA",
            "http.method" => "GET"
        },
        "duration" => 17579,
        "name" => "get",
        "traceId" => "97e9096d50474d6a",
        "localEndpoint" => {
            "serviceName" => "sample",
            "ipv4" => "127.0.0.1"
        },
        "remoteEndpoint" => {
            "port" => 56208,
            "ipv6" => "::1"
        }
    }

(Charlie) #5

Would it work to place the value in a field, like this:

{
"field" => "[{"http.method":"GET","http.path":"/serviceB"},"timestamp":1557911843380103,"parentId":"32f100cf25cd47ad","id":"1b61c2205ff61f8f","name":"get","localEndpoint":{"ipv4":"127.0.0.1","serviceName":"sample"},"traceId":"32f100cf25cd47ad"}]",
"@version" => "1",
"message" => "{"http.method":"GET","http.path":"/serviceB"},"timestamp":1557911843380103,"parentId":"32f100cf25cd47ad","id":"1b61c2205ff61f8f","name":"get","localEndpoint":{"ipv4":"127.0.0.1","serviceName":"sample"},"traceId":"32f100cf25cd47ad"}",
"host" => "testnode",
"@timestamp" => 2019-05-16T13:50:51.941Z
}

?

#6

OK. You could do it using the json_encode filter (which you will need to install). First, copy all the interesting fields into another field, which we can put under [@metadata]

    mutate { remove_field => [ "@timestamp", "@version", "host", "message", "sequence" ] }
    ruby {
        code => '
            event.to_hash.each { |k,v|
                event.set("[@metadata][fields][#{k}]", v)
            }
        '
    }

Then encode it into another field, which again can be in [@metadata]

    json_encode { source => "[@metadata][fields]" target => "[@metadata][string]" }

and output it using a plain codec

output { stdout { codec => plain { format => "[ %{[@metadata][string]} ]
" } } }

Note that you use a literal newline in the format string to tell it to append a newline to the output.
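
The three steps above can be sketched in plain Ruby, outside Logstash, to show what ends up in the output. This is only an illustration under stated assumptions: a Hash stands in for the Logstash event, `JSON.generate` stands in for json_encode, and `wrap_as_json_array` is a hypothetical helper name, not a Logstash API.

```ruby
require "json"

# Plain-Ruby stand-in for the steps above. A Hash plays the role of the
# Logstash event; wrap_as_json_array is a hypothetical helper for illustration.
def wrap_as_json_array(event_fields)
  # Step 1: copy every remaining field into one container
  # (the ruby filter above uses [@metadata][fields] for this).
  fields = {}
  event_fields.each { |k, v| fields[k] = v }

  # Step 2: serialize the container to a JSON string (the json_encode step).
  json_string = JSON.generate(fields)

  # Step 3: add the brackets and a trailing newline, as the plain codec format does.
  "[ #{json_string} ]\n"
end

# Sample fields taken from the rubydebug output earlier in the thread.
sample_event = {
  "id" => "97e9096d50474d6a",
  "name" => "get",
  "localEndpoint" => { "serviceName" => "sample", "ipv4" => "127.0.0.1" }
}

puts wrap_as_json_array(sample_event)
```

The result parses as a JSON array containing exactly one object, which is the shape the consumer asked for.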

(Vineethpp) #8

Thank you @Badger, it worked like a charm.
But I think Logstash should have a built-in converter from a JSON object to an array of JSON objects.

(Vineethpp) #9

Thank you @pastechecker. I tried this approach, but it did not actually work. The approach suggested by @Badger worked.