Input file (JSON format) through Logstash to Elasticsearch


#1

Hey,

logstash.conf

input {
  file {
    path => "/root/file.json"
    start_position => "beginning"
    codec => json
    type => "state"
  }
}

output {
  if [type] == "state" {
    elasticsearch {
      hosts => ["localhost:9200"]
      index => "state-%{+YYYY.MM.dd}"
    }
  }
}

I am saving the JSON file (the input to Logstash) with:
curl -k --request GET --url https://xxx --header "x-auth-token: xxx" | jq '.' > file.json

The JSON file looks like this:

{
  "server": {
    "status": "ok",
    "code": "",
    "message": "Operation done successfully."
  },
  "counts": {
    "data_counts": 21,
    "total_counts": 21
  },
  "data": {
    "pprc": [
      {
        "sourcevolume": {
          "id": "xxx",
          "link": {
            "rel": "self",
            "href": "xxx"
          }
        },
        "targetvolume": {
          "id": "xxx",
          "link": {}
        },
        "targetsystem": {
          "id": "xxxx",
          "link": {}
        },
        "type": "globalcopy",
        "state": "copy_pending"
      },
    }
    ]
  }
}

In the Kibana dashboard, it looks like each line is being read as a separate document:

{
  "_index": "state-2018.11.16",
  "_type": "doc",
  "_id": "aZxaHWcB1vLuNnA9uMrP",
  "_version": 1,
  "_score": null,
  "_source": {
    "@timestamp": "2018-11-16T16:29:03.955Z",
    "@version": "1",
    "path": "/root/file.json",
    "host": "xxxx",
    "message": " \"type\": \"globalcopy\",",
    "tags": [
      "_jsonparsefailure"
    ],
    "type": "state"
  },
  "fields": {
    "@timestamp": [
      "2018-11-16T16:29:03.955Z"
    ]
  },
  "sort": [
    1542385743955
  ]
}

{
  "_index": "state-2018.11.16",
  "_type": "doc",
  "_id": "epxaHWcB1vLuNnA9uMrP",
  "_version": 1,
  "_score": null,
  "_source": {
    "@timestamp": "2018-11-16T16:29:03.955Z",
    "@version": "1",
    "path": "/root/file.json",
    "host": "xxx",
    "message": " \"state\": \"copy_pending\"",
    "tags": [
      "_jsonparsefailure"
    ],
    "type": "state"
  },
  "fields": {
    "@timestamp": [
      "2018-11-16T16:29:03.955Z"
    ]
  },
  "sort": [
    1542385743955
  ]
}

I added this multiline codec to the Logstash configuration, but then no data is sent to Elasticsearch at all:

codec => multiline {
  pattern => '^\{'
  negate => true
  what => previous
}
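One likely reason nothing comes through with this codec: with negate => true / what => previous, the buffered event is only flushed when the next line matching ^\{ arrives, and for a file containing a single JSON document that never happens. The multiline codec has an auto_flush_interval option that flushes the buffered event after a period of inactivity; a sketch (the 2-second value is an arbitrary choice, the path and type are from the thread):

```
input {
  file {
    path => "/root/file.json"
    start_position => "beginning"
    type => "state"
    codec => multiline {
      pattern => "^\{"
      negate => true
      what => "previous"
      # Flush the pending event after 2 s of no new lines,
      # so the last (only) document is not held forever.
      auto_flush_interval => 2
    }
  }
}
```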

I even tried removing the jq filter, but that did not send data to Elasticsearch either:

curl -k --request GET --url https://xxx --header "x-auth-token: xxx" > file.json

The JSON file:

{"server":{"status":"ok","code":"","message":"Operation done successfully."},"counts":{"data_counts":21,"total_counts":21},"data":{"pprc":[{"sourcevolume":{"id":"xxx","link":{"rel":"self","href":"https:xxxx"}},"targetvolume":{"id":"1403","link":{}},"targetsystem":{"id":"xxx","link":{}},"type":"globalcopy","state":"copy_pending"}}]}}

Can anyone help me?


(Walker) #2

You are correct in adding the multiline codec; Logstash uses newlines as the delimiter for events. For your multiline pattern, are you using single or double quotes? If single, as in your example, change them to double quotes, and if that still doesn't work, remove the backslash. I don't believe { is a special character that needs to be escaped.


#3

Thanks for your reply.
I will try the multiline codec.

Why is this format not working? The whole file is on a single line:

{"server":{"status":"ok","code":"","message":"Operation done successfully."},"counts":{"data_counts":21,"total_counts":21},"data":{"pprc":[{"sourcevolume":{"id":"xxx","link":{"rel":"self","href":"https:xxxx"}},"targetvolume":{"id":"1403","link":{}},"targetsystem":{"id":"xxx","link":{}},"type":"globalcopy","state":"copy_pending"}}]}}


(Walker) #4

That's weird... I didn't realize it was coming in as a single line when I originally responded. I wonder if the curl request is returning data in chunks and Logstash is ingesting it before it's complete. You could try changing the file input's stat_interval to 5 (the default is 1 s) to see if that improves your situation. It's possible the file is being read before all the data has been written to it.
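For reference, that setting goes on the file input itself; a sketch, reusing the path from the thread (I believe newer Logstash versions also accept a string duration such as "5 seconds" here):

```
input {
  file {
    path => "/root/file.json"
    start_position => "beginning"
    # Check the file for changes every 5 s instead of the default 1 s,
    # giving curl more time to finish writing it.
    stat_interval => 5
    type => "state"
  }
}
```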


(Christian Dahlqvist) #5

It is not valid JSON as there seems to be a curly brace too many towards the end. Have a look at it using e.g. jsonlint, and you will see.
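To see the problem quickly, you can round-trip the payload through any JSON parser. A minimal sketch; the string below is my own trimmed stand-in for the real file, reproducing the stray "}" just before the closing "]" of the pprc array:

```python
import json

# Shortened stand-in for the file in the thread: note the extra "}"
# after the pprc object, before the "]" that closes the array.
bad = '{"data":{"pprc":[{"type":"globalcopy","state":"copy_pending"}}]}}'

try:
    json.loads(bad)
    print("valid JSON")
except json.JSONDecodeError as e:
    print("invalid JSON:", e.msg)
```

Removing the extra brace makes the document parse cleanly.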


#6

This worked once I broke up the line:
I added \n characters to file.json and applied the json and split filters, and then it worked.

input {
  file {
    path => "file/path/file.json"
    type => "state"
    start_position => "beginning"
  }
}

filter {
  if [type] == "state" {
    json {
      source => "message"
      remove_field => ["message"]
    }
    split {
      field => ["[data][pprc]"]
    }
  }
}
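One small note on that last step: as far as I can tell, the split filter's field option is documented as a single field reference (a string), so it may be safer to write it without the array brackets:

```
filter {
  json {
    source => "message"
    remove_field => ["message"]
  }
  split {
    # One event is emitted per element of the pprc array.
    field => "[data][pprc]"
  }
}
```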