Logstash config file for JSON input

I'm having issues with the following config file after following a tutorial from someone on the web. My main goal is to take a JSON file, load it into Logstash, and bring over all of the fields from the original JSON file. My input is the JSON file and the output is Elasticsearch. I tried adding each field as a source => "Something" line and then running a mutate {convert => ["Latitude", "float"]} command to convert the double fields into floats so they could be used with Kibana, as the tutorial suggested. The issue is that the config file with all of the fields listed (below) won't work, but if I remove all of the fields and don't mutate anything, I can get the data into Logstash using just one source line, source => "Provider".

This config doesn't work:
input {
  file {
    path => "D:\ElasticSearch\EEN_JSON_TEST\OUTPUT\output2016_01.json"
    type => "json"
    start_position => "beginning"
  }
}
filter {
  json {
    source => "Latitude"
    source => "Longitude"
    source => "Accuracy"
    source => "DateTime"
    source => "Provider"
    source => "Bearing"
    source => "Acce_X"
    source => "Acce_Y"
    source => "Acce_Z"
    source => "Orient_X"
    source => "Orient_Y"
    source => "Orient_Z"
    source => "TimeRaw"
    source => "DeviceID"
    source => "Manufacturer"
    source => "GeoTags"
    source => "Speed_Cal"
    source => "Exc_Model_Cal"
    source => "Operator_Cal"
    source => "Risk_Dyn_Cal"
    source => "Speed"
    source => "Model"
    source => "Version"
    source => "Username"
    source => "ValID"
  }
  mutate {convert => ["Latitude", "float"]}
  mutate {convert => ["Longitude", "float"]}
  mutate {convert => ["Speed_Cal", "float"]}
  mutate {convert => ["Risk_Dyn_Cal", "float"]}
  mutate {convert => ["Speed", "float"]}
}

output {
  elasticsearch {
    action => "index"
    host => "localhost"
    index => "EEN"
  }
  stdout {}
}

This config does work, but I'm not sure it is correct or that it allows me to do what I want to do in Kibana:
input {
  file {
    path => "D:\ElasticSearch\EEN_JSON_TEST\OUTPUT\output2016_01.json"
    type => "json"
    start_position => "beginning"
  }
}
filter {
  json {
    source => "Provider"
  }
}

output {
  elasticsearch {
    action => "index"
    hosts => "localhost"
    index => "eenindigo"
  }
  stdout {}
}

Thanks in Advance. Jason

Can you give an example of a line from output2016_01.json?

Here you go, this is what it looks like when I open it in Visual Studio Code:

[{
  "Longitude" : 0.0,
  "Latitude" : 0.0,
  "Accuracy" : "0.0",
  "Speed" : 0.0,
  "DateTime" : 1452110632276,
  "provider" : "0",
  "bearing" : "0.0",
  "Acce_X" : "0.035913024",
  "Acce_Y" : "0.39085343",
  "Acce_Z" : "9.865907",
  "Orient_X" : "-2.7703679",
  "Orient_Y" : "-0.03959561",
  "Orient_Z" : "-0.003640098",
  "TimeRaw" : "1452110631000",
  "DeviceID" : "990005871041100",
  "Manufacturer" : "samsung",
  "Model" : "SM-G900V",
  "Version" : "5.0",
  "ValID" : "CODES-9983-7032-5001",
  "Username" : "GPS EEN Device 2",
  "Geometry" : {
    "x" : 0.0,
    "y" : 0.0,
    "z" : 0.0,
    "spatialReference" : {
      "wkid" : 4326
    }
  }
},

So, the top-level JSON entity is an array and it's spread over multiple lines? Or do you have multiple such arrays in the file, separated by newline characters? Regardless, Logstash is not going to like it. You'd have to use a multiline filter or codec to join all the lines into a single line that you might be able to feed to a json filter, but I'm not completely sure it really likes arrays.
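Something along these lines might be a starting point. This is only a rough, untested sketch that assumes each record in the file begins on a line starting with "{" (the first one with "[{") and ends with "}," as in your sample, so the patterns would likely need adjusting to the real file:

input {
  file {
    path => "D:\ElasticSearch\EEN_JSON_TEST\OUTPUT\output2016_01.json"
    start_position => "beginning"
    # join every line that does not start a new object onto the previous event
    codec => multiline {
      pattern => '^\s*\[?\{'
      negate => true
      what => "previous"
    }
  }
}
filter {
  mutate {
    # strip the newlines left over from the join, the array brackets, and the
    # trailing "," after each record's closing brace
    gsub => [ "message", '\n', '',
              "message", '^\s*\[', '',
              "message", '\]\s*$', '',
              "message", '\}\s*,\s*$', '}' ]
  }
  json { source => "message" }
}

The details will almost certainly need tweaking, but the idea is to join the lines into one message first and only then hand that message to the json filter.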

Yeah, I am not entirely sure since I am new to this stuff. Essentially, I have a JSON file with all of the fields you saw above and I want to move them into Logstash. I thought I read that JSON is native to Logstash, so possibly just having an input and an output in the config file would be enough to move the JSON data into Elasticsearch via Logstash. My confusion is over what code is needed just to move the data over, and what additional code is needed to convert some of my double fields to floats so that I can use them in Kibana. I really thought I could have input, filter, and output, and in my case the filter would only be used to mutate the double fields to floats. I was planning to test the following code soon to see if this theory would work:

input {
  file {
    path => "D:\ElasticSearch\EEN_JSON_TEST\OUTPUT\output2016_01.json"
    type => "json"
    start_position => "beginning"
  }
}
filter {
  json {
    mutate {convert => ["Latitude", "float"]}
    mutate {convert => ["Longitude", "float"]}
    mutate {convert => ["Speed_Cal", "float"]}
    mutate {convert => ["Risk_Dyn_Cal", "float"]}
    mutate {convert => ["Speed", "float"]}
  }
}

output {
  elasticsearch {
    action => "index"
    hosts => "localhost"
    index => "eendex"
  }
  stdout {}
}

So far I have been able to load the data from the JSON file into Elasticsearch via Logstash using just the source => command, but I am not sure it is being stored correctly. All of the fields in the JSON file seem to have come over into Kibana, but they show up as individual lines for each field rather than one long log entry that lists all of the fields. Not sure what is correct.

Sorry, sort of vague here.

Jason

Hey Magnus, I found some stuff online that allowed me to get all of my fields nested into one JSON document inside Logstash, and I was able to do a few data searches. But now when I try to do a geo-distance search, I realize my lat/long values are not in one field and are also not of the proper float type. So I followed some of the logic in this article http://david.pilato.fr/blog/2015/04/28/exploring-capitaine-train-dataset/ to try to extend my filter, but something doesn't seem to be working exactly right. Any chance you could look at the following code and let me know if anything stands out? Thanks.

input {
  file {
    codec => multiline {
      pattern => '^{'
      negate => true
      what => previous
    }
    path => "D:\ElasticSearch\EEN_JSON_TEST\OUTPUT\eendexify2016_01.json"
    start_position => "beginning"
    exclude => "*.gz"
  }
}

filter {

  mutate {
    convert => { "longitude" => "float" }
    convert => { "latitude" => "float" }
  }
  mutate {
    rename => {
      "longitude" => "[location][lon]"
      "latitude" => "[location][lat]"
    }
  }

  mutate {
    replace => [ "message", "%{message}}" ]
    gsub => [ 'message','\n','']
  }
  if [message] =~ /^{.*}$/ {
    json { source => message }
  }

}

output {
  elasticsearch {
    action => "index"
    codec => json
    hosts => "localhost"
    index => "eendexeon"
  }

  stdout { codec => dots }

}

It would be easier to help if you'd post an example input message, the result you get, and the result you expected. However, there are a few things that look odd.

replace => [ "message", "%{message}}" ]
gsub => [ 'message','\n','']

What's the point of this, and why is there an extra closing brace on the first line? Also, keep in mind that the mutate filter's different functions execute in a fixed order regardless of the order in which you list things in the config file. If it's important that the replace operation executes before the gsub operation you need to use two consecutive mutate filters.
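That would look something like this; just a sketch of the structure, reusing the two operations from your filter:

filter {
  # first mutate: the replace is applied here
  mutate {
    replace => [ "message", "%{message}}" ]
  }
  # second mutate: the gsub only runs after the first mutate has finished
  mutate {
    gsub => [ "message", '\n', '' ]
  }
}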

if [message] =~ /^{.*}$/ 

Curly braces are metacharacters in regular expressions. While probably not necessary in this case, it's a good habit to always escape them.
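Escaped, the condition would look like this:

if [message] =~ /^\{.*\}$/ {
  json { source => "message" }
}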

Finally, Logstash's default index template doesn't map the location field as a geo_point field. See the ES documentation for information about what a field needs to look like to be acceptable as a geo_point (which still requires the field to be mapped as such).
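For reference, after your rename the event should end up with a field shaped roughly like

"location" : { "lat" : 40.7128, "lon" : -74.0060 }

(those coordinates are just made-up example values), and the location field in the index mapping would need to be declared with "type" : "geo_point" before any documents are indexed; otherwise Elasticsearch will map it as an ordinary object with two numeric subfields.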

Thanks for the response. Here is a sample input coming from ArcGIS as a JSON document. Please disregard that the lat and long are zero; the device is collecting inside a building at the moment.
{
  "Longitude" : 0.0,
  "Latitude" : 0.0,
  "Accuracy" : "0.0",
  "Speed" : 0.0,
  "DateTime" : 1453132207070,
  "provider" : "0",
  "bearing" : "0.0",
  "Acce_X" : "0.10594343",
  "Acce_Y" : "-0.046686932",
  "Acce_Z" : "10.21127",
  "Orient_X" : "0.086673655",
  "Orient_Y" : "0.0045718206",
  "Orient_Z" : "-0.010374774",
  "TimeRaw" : "1453132205000",
  "DeviceID" : "990004889678235",
  "Manufacturer" : "samsung",
  "Model" : "SM-G900V",
  "Version" : "5.0",
  "ValID" : "CODES-9983-7032-5001",
  "Username" : "GPS EEN Device 6",
  "Geometry" : {
    "x" : 0.0,
    "y" : 0.0,
    "z" : 0.0,
    "spatialReference" : {
      "wkid" : 4326
    }

The output in Kibana is not looking the same as it did the first few times I ran it, but when I used the code I gave you in the previous post I was able to see all of the fields above nested into one message as a single entry. That is what I expected to see, but I had hoped to have some usable fields to search by, and none of my original fields from the JSON were usable in Kibana or searchable in Sense.
I then tried to map all of my fields first with the proper field types using field mappings, and although the mappings would show up in Kibana under my index, the data from the JSON was not being parsed into these preset fields.

Essentially I just want to take the JSON input detailed above, load it into Logstash and ES, and make the proper fields, such as lat/long and a few others, usable for searching and visualization. I see your point about making the location field a geo_point, so I guess I would have to map that field as such before sending data to my index? Or add it and mutate the data in my config file?

The code I sent you before was from someone on the internet. I have been using it because it successfully put all of my fields into one message, which showed up that way in Kibana, and I thought that was correct so I kept it. But now when I look in Kibana after running more tests, it looks like it is trying to put multiple records into one message, and then I get multiline_codec_max_lines_reached at the end. I have put a few images below to show what I am seeing.

If you know of a simple config file on the web that would parse a multi-line JSON file like mine, mutate the proper fields so they could be searched and used in Kibana, and store each incoming record as a separate message, I believe that would be what I am looking for. If it is not obvious, I am really new to this. Thanks!

Yeah, you'd have to tweak the multiline filter so that it recognizes that "}, {" marks the end of the current message. It won't be completely trivial since that line needs to result in "}" at the end of the first message and "{" at the beginning of the next. I'm afraid I don't have time to work on this.

I am working on large JSON data.
I am using the ELK stack, version 5.0.
I am getting multiline_codec_max_lines_reached when the JSON data is large, but it works fine if some of the data is removed.
I changed the heap size in the jvm.options file to 4GB for both Elasticsearch and Logstash.
I am still getting the same exception and have not been able to find a solution.
Please help me with this.

@amol_sonawane1, please start a new thread for your unrelated question.