Extract Data from message to display each field as a column in kibana

Prabh_Natt · March 7, 2018, 4:26pm

I want to extract the fields I need from message and select them as their own fields, and also index everything dynamically using the "clientCode". I have been working on this for the past couple of days and I'm stuck. Help is greatly appreciated.
This is my config file:

input {
  file {
    path => ["C:\logstash-6.2.2\conversation_stats\conversation_stats.json"]
    start_position => "beginning"
    sincedb_path => "/dev/null"
    ignore_older => 0
  }
}
filter {
  grok {
    match => { "message" =>
"%{DATA:id}
%{DATA:clientCode}
%{DATA:conversationID}
%{INT:employeeID}
%{DATA:entities}
%{DATA:input}
%{DATA:intents}
%{DATA:locale} " }
  }
  mutate {
    gsub => ["message", "[:<>.,]", ""]
  }
  if [message] != "(null)" {
    json {
      source => "message"
      target => "jmessage"
    }
  }
  mutate { remove_field => ["message"] }
}
output {
  stdout { codec => rubydebug }
  elasticsearch {
    action => "index"
    hosts => ["localhost:9200"]
    index => "test-%{clientCode}"
  }
}

Sample error I'm getting in cmd:

[2018-03-07T11:09:37,402][WARN ][logstash.outputs.elasticsearch] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"test-%{clientCode}", :_type=>"doc", :_routing=>nil}, #LogStash::Event:0x737c4bbc], :response=>{"index"=>{"_index"=>"test-%{clientCode}", "_type"=>"doc", "_id"=>nil, "status"=>400, "error"=>{"type"=>"invalid_index_name_exception", "reason"=>"Invalid index name [test-%{clientCode}], must be lowercase", "index_uuid"=>"na", "index"=>"test-%{clientCode}"}}}}
{
          "tags" => [
        [0] "_grokparsefailure",
        [1] "_jsonparsefailure"
    ],
    "@timestamp" => 2018-03-07T16:09:36.569Z,
          "path" => "C:\logstash-6.2.2\conversation_stats\conversation_stats.json",
      "@version" => "1",
          "host" => "MRK-06576"
}

Here is sample data from my json file:

{
    "_id" : ObjectId("5a21e54533015"),
    "clientCode" : "demo",
    "conversationId" : "d6416ec0--930f-da9f3215",
    "employeeId" : "45",
    "entities" : [
        {
            "entity" : "status",
            "location" : [
                NumberInt("0"),
                NumberInt("2")
            ],
            "value" : "ok",
            "confidence" : NumberInt("1")
        }
    ],
    "input" : {
        "feedback" : {
            "feedbackSubject" : "my feedbac",
            "feedbackText" : "feedback\nthis is good\nI love this",
            "feedbackCategory" : "",
            "conversationId" : "d6416ec0--930f-da9f3215",
            "conversationText" : "(HI) [Greetings, human.]",
            "conversationNodeName" : "root"
        }
    },
    "intents" : [
        {
            "intent" : "feedbackresponse",
            "confidence" : NumberInt("1")
        }
    ],
    "locale" : "en-ca"
}

Badger · March 7, 2018, 5:48pm

If the JSON is all on one line then the following is all you need to parse it. If it is not all on one line then there are lots of threads that discuss how to use multiline codecs.

  mutate {
    gsub => [ "message", 'NumberInt\("([0-9]+)"\)', "\1" ]
    gsub => [ "message", 'ObjectId\("([a-z0-9]+)"\)', '"\1"' ]
  }
  json { source => "message" }
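Placed in a complete pipeline, the two filters above might look like the sketch below. It assumes each event already holds one whole JSON object in message. It leaves out the grok filter and the gsub that strips "[:<>.,]" from the original config, since that gsub removes the colons and commas the JSON parser needs. The conditional around the elasticsearch output is an addition for illustration, not part of the original suggestion: when clientCode is missing, the reference %{clientCode} is left unresolved and the literal index name test-%{clientCode} is rejected, which is the invalid_index_name_exception shown in the first post.

```conf
filter {
  mutate {
    # Turn the MongoDB shell wrappers into plain JSON values:
    #   NumberInt("1")    -> 1
    #   ObjectId("5a21")  -> "5a21"
    gsub => [
      "message", 'NumberInt\("([0-9]+)"\)', '\1',
      "message", 'ObjectId\("([a-z0-9]+)"\)', '"\1"'
    ]
  }
  # Parse the cleaned-up text; the parsed keys become top-level event fields.
  json { source => "message" }
}

output {
  stdout { codec => rubydebug }
  # Assumption: only index events where parsing produced a clientCode field.
  if [clientCode] {
    elasticsearch {
      hosts => ["localhost:9200"]
      index => "test-%{clientCode}"
    }
  }
}
```

Elasticsearch index names must be lowercase, so if clientCode values can contain uppercase letters, a mutate with lowercase => ["clientCode"] before the output would be one way to handle that.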

Prabh_Natt · March 7, 2018, 8:20pm

I tried using what you suggested, but now it gives me a grok parse error and a json parse error.

Badger · March 7, 2018, 8:26pm

Why are you using grok?
Can you show the rubydebug output?

Prabh_Natt · March 7, 2018, 8:33pm

I'm kinda new to this and grok seemed like the best fit for me. I'm using it to grab the data from my json file.
Here is the rubydebug output:

{
          "tags" => [
        [0] "_grokparsefailure"
    ],
          "host" => "MRK-06576",
    "@timestamp" => 2018-03-07T20:35:27.624Z,
      "@version" => "1",
       "message" => "\t"input" : {",
          "path" => "C:\logstash-6.2.2\conversation_stats\conversation_stats.json"
}
{
          "tags" => [
        [0] "_grokparsefailure"
    ],
          "host" => "MRK-06576",
    "@timestamp" => 2018-03-07T20:35:27.624Z,
      "@version" => "1",
       "message" => "\t\t",
          "path" => "C:\logstash-6.2.2\conversation_stats\conversation_stats.json"
}

Badger · March 7, 2018, 8:59pm

If your input is valid JSON, or even close to it, then a json filter is most likely better than grok. Now, for the sample data you showed in the first post, you need to configure the input so that rubydebug shows the entire JSON object in a single event. Like this...

"message" => "{ \"_id\" : ObjectId(\"5a21e54533015\"), \"clientCode\" : \"demo\", \"conversationId\" : \"d6416ec0--930f-da7aa79f3215\", \"employeeId\" : \"45\", \"entities\" : [ { \"entity\" : \"status\", \"location\" : [ NumberInt(\"0\"), NumberInt(\"2\") ], \"value\" : \"ok\", \"confidence\" : NumberInt(\"1\") } ], \"input\" : { \"feedback\" : { \"feedbackSubject\" : \"my feedbac\", \"feedbackText\" : \"feedback\\nthis is good\\nI love this\", \"feedbackCategory\" : \"\", \"conversationId\" : \"d6416ec0-2f9a-42fb-930f-da7aa79f3215\", \"conversationText\" : \"(HI) [Greetings, human.]\", \"conversationNodeName\" : \"root\" } }, \"intents\" : [ { \"intent\" : \"feedbackresponse\", \"confidence\" : NumberInt(\"1\") } ], \"locale\" : \"en-ca\" }",

as opposed to what you have now, which is

"message" => "\t"input" : {",

If you want to consume the entire file as a single event then you will need to use a multiline codec. You can use the trick described here, of appending a line that is known not to occur in the input. Some people will recommend using auto_flush_interval, but personally I think that is an ugly hack.
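For a file that holds several pretty-printed objects, one possible shape for the input is sketched below. It is not the linked example. It assumes every top-level object starts with "{" in the first column and that all nested lines are indented, which the "\t" at the start of the messages in the rubydebug output suggests. The forward slashes in the path and the "NUL" value for sincedb_path are Windows adjustments, not settings taken from the original config.

```conf
input {
  file {
    # Assumption: forward slashes, which the file input expects on Windows.
    path => ["C:/logstash-6.2.2/conversation_stats/conversation_stats.json"]
    start_position => "beginning"
    # Assumption: "NUL" is the Windows counterpart of /dev/null.
    sincedb_path => "NUL"
    codec => multiline {
      # A line beginning with "{" starts a new event...
      pattern => "^\{"
      # ...and every line that does NOT match is appended to the previous one.
      negate => true
      what => "previous"
      # Optional alternative to appending a sentinel line to the file:
      # auto_flush_interval => 2
    }
  }
}
```

With this pattern the last object in the file stays buffered until another line starting with "{" arrives, which is why either an appended line or auto_flush_interval is needed to flush it.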

Prabh_Natt · March 8, 2018, 3:09pm

How would I get the message to display like that using the pattern field? I can't figure out the regex for it. Still kinda new to all this.
The sample data I posted above is straight out of the json file I'm working with; the rest of the data in the file is similar.

Badger · March 9, 2018, 7:31pm

As I said, if you want to consume multiple lines of JSON from the file as a single event then you will need to use a multiline codec.

Prabh_Natt · March 9, 2018, 8:21pm

What I want to be able to see in Kibana when I look at all possible fields is things like:
message.input
message.clientCode
message.id
Would this be possible with the multiline codec? I gave it a shot and it didn't work.

Badger · March 9, 2018, 9:56pm

Yes, and I linked to a post with an example of doing that.
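As a rough illustration of how prefixed field names like those could come about (this is not the linked example), the json filter accepts a target option that nests the parsed keys under one field. The name conversation below is a hypothetical choice, and the sketch assumes message already contains one complete JSON object with the NumberInt and ObjectId wrappers rewritten.

```conf
filter {
  json {
    source => "message"
    # Hypothetical field name; parsed keys end up under [conversation].
    target => "conversation"
    # Only applied when parsing succeeds, so failed events keep the raw text.
    remove_field => ["message"]
  }
}

output {
  # A nested field is referenced with bracket syntax in the index name.
  if [conversation][clientCode] {
    elasticsearch {
      hosts => ["localhost:9200"]
      index => "test-%{[conversation][clientCode]}"
    }
  }
}
```

Kibana would then list the fields with a dotted prefix, for example conversation.clientCode and conversation.input.feedback.feedbackText.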

system · April 6, 2018, 9:56pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.