Input JSON file

Hi

I am trying to ingest some inventory data from JSON files using the following Logstash config.

input {
  file {
    type => "json"
    path => "/opt/uploaddata/*.json"
    start_position => "beginning"
  }
}

filter {
  json {
    source => "message"
    target => "message"
  }
}

output {
  elasticsearch {
    hosts => [ "10.138.7.51:9200" ]
    index => "inventory-%{+YYYY-MM-dd}"
  }
  stdout {}
}

The format of the JSON file is as follows (truncated here):

{
  "_meta": {
    "hostvars": {
      "host1": {
        "foreman": {
          "architecture_id": 1,
          "architecture_name": "x86_64",
          "capabilities": [
            "build"
          ],
          "certname": "host1",
          "comment": "this is hostname1",
          "created_at": "2017-03-08T15:27:11Z",
          "disk": "10gb",
          "domain_id": 5
        },
        "foreman_facts": {
          "boardmanufacturer": "Intel Corporation",
          "boardproductname": "440BX Desktop Reference Platform",
          "ipaddress": "1.1.1.1",
          "ipaddress_eth0": "1.1.1.2",
          "ipaddress_lo": "127.0.0.1"
        },
        "foreman_params": {}
      },
      "host2": {
        "foreman": {
          "architecture_id": 1,
          "architecture_name": "x86_64",
          "capabilities": [
            "build"
          ],
          "certname": "host2",
          "comment": "this hostname2",
          "created_at": "2017-03-08T15:27:11Z",
          "disk": "20gb",
          "domain_id": 5
        },
        "foreman_facts": {
          "boardmanufacturer": "Intel Corporation",
          "boardproductname": "440BX Desktop Reference Platform",
          "ipaddress": "2.1.1.1",
          "ipaddress_eth0": "2.2.2.2",
          "ipaddress_lo": "127.0.0.1"
        },
        "foreman_params": {}
      },
      "all": [
        "host3",
        "host4"
      ],
      "foreman_environment": [
        "computer1",
        "computer2"
      ],
I am only interested in hostvars: I want to index one document per host, keyed on the hostname, and ignore all and foreman_environment. The desired result looks like this (a rough sketch of the filter I have in mind follows the two examples):

Elastic doc id 1:

"computer name": "host1",
"architecture_id": 1,
"architecture_name": "x86_64",
"capabilities": ["build"],
"certname": "host1",
"comment": "this is hostname1",
"created_at": "2017-03-08T15:27:11Z",
"disk": "10gb",
"domain_id": 5,
"foreman_facts": {
  "boardmanufacturer": "Intel Corporation",
  "boardproductname": "440BX Desktop Reference Platform",
  "ipaddress": "1.1.1.1",
  "ipaddress_eth0": "1.1.1.2",
  "ipaddress_lo": "127.0.0.1"
}

Elastic doc id 2:

"computer name": "host2",
"architecture_id": 1,
"architecture_name": "x86_64",
"capabilities": ["build"],
"certname": "host2",
"comment": "this hostname2",
"created_at": "2017-03-08T15:27:11Z",
"disk": "20gb",
"domain_id": 5,
"boardmanufacturer": "Intel Corporation",
"boardproductname": "440BX Desktop Reference Platform",
"ipaddress": "2.1.1.1",
"ipaddress_eth0": "2.2.2.2",
"ipaddress_lo": "127.0.0.1"

However, I am getting the following errors when ingesting the data with the above config, and Elasticsearch also indexes every JSON line as a separate document.

Logstash errors:

at [Source: [B@45247ee5; line: 2, column: 3]>}
15:27:46.641 [[main]>worker0] WARN logstash.filters.json - Error parsing json {:source=>"message", :raw=>" "_meta": {\r", :exception=>#<LogStash::Json::ParserError: Unexpected character (':' (code 58)): expected a valid value (number, String, array, object, 'true', 'false' or 'null')
at [Source: [B@49d56f1d; line: 1, column: 11]>}
15:27:46.641 [[main]>worker0] WARN logstash.filters.json - Error parsing json {:source=>"message", :raw=>" "hostvars": {\r", :exception=>#<LogStash::Json::ParserError: Unexpected character (':' (code 58)): expected a valid value (number, String, array, object, 'true', 'false' or 'null')
at [Source: [B@7a32a610; line: 1, column: 16]>}
15:27:46.642 [[main]>worker0] WARN logstash.filters.json - Error parsing json {:source=>"message", :raw=>" "host1": {\r", :exception=>#<LogStash::Json::ParserError: Unexpected character (':' (code 58)): expected a valid value (number, String, array, object, 'true', 'false' or 'null')
at [Source: [B@5137768b; line: 1, column: 15]>}
15:27:46.642 [[main]>worker0] WARN logstash.filters.json - Error parsing json {:source=>"message", :raw=>" "foreman": {\r", :exception=>#<LogStash::Json::ParserError: Unexpected character (':' (code 58)): expected a valid value (number, String, array, object, 'true', 'false' or 'null')
at [Source: [B@685351f2; line: 1, column: 19]>}
15:27:46.643 [[main]>worker0] WARN logstash.filters.json - Error parsing json {:source=>"message", :raw=>" "architecture_id": 1, \r", :exception=>#<LogStash::Json::ParserError: Unexpected character (':' (code 58)): expected a valid value (number, String, array, object, 'true', 'false' or 'null')

Elastic document:

[screenshot of the resulting documents omitted]

Please help in solving the above issues. I have also tried the json_lines codec, but with it Logstash does not ingest a single message, and I don't see any activity in the Logstash window.

As I wrote in another topic yesterday: the file input reads files line by line. If you want to slurp an entire file into a single event, you need to use a multiline codec. I'm sure examples have been posted in the past.
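Something along these lines should do it; this is an untested sketch, and the never-matching pattern string and the flush interval are arbitrary choices:

input {
  file {
    path => "/opt/uploaddata/*.json"
    start_position => "beginning"
    codec => multiline {
      # A pattern that never matches real input; with negate => true
      # every line is then joined to the previous one, so the whole
      # file accumulates into a single event.
      pattern => "^__NEVER_MATCHES__"
      negate => true
      what => "previous"
      # Nothing ever terminates the event, so flush on a timeout
      # and raise the default line cap for large files.
      auto_flush_interval => 2
      max_lines => 10000
    }
  }
}

filter {
  json {
    source => "message"
  }
}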

Thanks Magnus,

Do you have a link to the topic you wrote yesterday?

Also, I want to confirm: if we ingest the same JSON data via Filebeat, do we still need to use the multiline codec? My final goal is to ingest the JSON file with Filebeat and send it to Logstash, which reshapes it before sending it on to Elasticsearch.

Generally, you will have a tough time with multiline or pretty-printed JSON in LS, because it is hard to know when the last } has been reached, especially if there is no newline after the final } character.

When you use pattern => "^}$" (a line holding only a closing brace) with what => "previous", the text matching the pattern is not seen as belonging to the previous lines, so the accumulated JSON never gets its final }.

Having said that, pattern => "^{$" with what => "next" relies on a timeout-based flush to empty the accumulation buffer, because another "\n{" is never seen. Filebeat's implementation of this timeout is better than the one in LS.

Also, I want to confirm: if we ingest the same JSON data via Filebeat, do we still need to use the multiline codec?

You need to use Filebeat's multiline feature, not the multiline codec on the Logstash side.
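For example, with a Filebeat 5.x prospector, something like the following should ship each pretty-printed document as one event (an untested sketch; the paths and the Logstash host/port are placeholders):

filebeat.prospectors:
  - input_type: log
    paths:
      - /opt/uploaddata/*.json
    # Lines that do not start with "{" are appended to the preceding
    # line that does, so a whole pretty-printed document (whose inner
    # lines are indented) becomes a single event.
    multiline.pattern: '^{'
    multiline.negate: true
    multiline.match: after
    # Flush a partially accumulated event after 5 s of silence.
    multiline.timeout: 5s

output.logstash:
  hosts: ["10.138.7.51:5044"]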

