Ingest multiple json files under a folder

decodeit · July 11, 2019, 1:33pm

I am trying to make Logstash ingest multiple JSON files under a folder. Each JSON file is multiline, about 100 lines long.

I don't see any output.
Here is my config.

`input {
file{    
	path => "/data/nhs.json"
	start_position => "beginning"
	sincedb_path => "/dev/null"
codec =>json
}

}

stdout {
codec => rubydebug
}
}
`

Badger · July 11, 2019, 1:47pm

A file input reads a file one line at a time. A json codec will only work if each line is a complete JSON object. If the JSON is spread across multiple lines you would need a multiline codec.

You say you want to ingest multiple files but your path option is not a wildcard, so it refers to a single file.
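
As a rough sketch of the wildcard point, assuming the files live under /data as in the question: a glob in the path option picks up every matching file, but the json codec remains appropriate only when each line of each file is a complete JSON object.

```
input {
  file {
    # Glob pattern, so every .json file in the folder is read
    # (the /data directory is taken from the question)
    path => "/data/*.json"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    # Only suitable when every line is one complete JSON object
    codec => json
  }
}
```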

decodeit · July 11, 2019, 1:48pm

Hi, yes, the wildcard was my mistake. I had it as *.json when I ran it earlier.
I will try the multiline codec and update this thread.

decodeit · July 11, 2019, 1:52pm

This is an example of one of my JSON files. Could you give me an example of what my multiline codec should look like?

`{
"codeBook": {
	"docDscr": {
		"citation": {
			"titlStmt": {
				"titl": "International Travel Survey: Canadians, 2016 [Canada]",
				"altTitl": "ITS 2016: Canadians"
			},
			"holdings": {
				"_location": "Statistics Canada. Data Liberation Initiative",
				"_URI": "http://dli-idd-nesstar.statcan.gc.ca/webview/"
			}
		},
		"docSrc": {
			"titlStmt": {
				"titl": "ITS-2016-Canadians-E"
			}
		}
	},
	"stdyDscr": {
		"citation": {
			"titlStmt": {
				"titl": "International Travel Survey: Canadians, 2016 [Canada]",
				"altTitl": "ITS 2016: Canadians"
			}
		},
		"stdyInfo": {
			"subject": {
				"keyword": [
					"Accommodation",
					"Age",
					"Air transport",
					"Automobile travel",
					"Balance of payments",
					"Beverages",
					"Border crossings",
					"Country of origin",
					"Data collection",
					"Data quality",
					"Entertainment",
					"Expenditures",
					"Fares",
					"Food expenditures",
					"Food purchased from restaurants",
					"Length of stay",
					"Modes of transport",
					"Nonresponse rate",
					"Place of residence",
					"PUMFFILE",
					"Questionnaires",
					"Rail transport",
					"Receipts",
					"Recreation",
					"Sex",
					"Surveys",
					"Tours",
					"Travellers",
					"Travel"
				],
				"topcClas": [
					"International travel",
					"Tourism indicators",
					"Travel and tourism"
				]
			},`

Badger · July 11, 2019, 2:06pm

If you want to ingest the entire file as a single event, then use a multiline codec with a pattern that never matches, plus a timeout:

codec => multiline { pattern => "^Spalanzani" negate => true what => previous auto_flush_interval => 1 }

If you have multiple pretty-printed objects, so that a line containing just } indicates the end of an object, use:

codec => multiline { pattern => "^}" negate => true what => next auto_flush_interval => 1 }
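
To make the second pattern concrete, here is a sketch of it placed inside a file input. The path comes from the question; the rest of the wiring is an assumption. The pattern anchors on a } at the very start of a line, so the indented closing braces of nested objects, as in the sample file above, do not end the event.

```
input {
  file {
    path => "/data/*.json"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    codec => multiline {
      # With negate => true, any line that does NOT start with "}"
      # is joined with the line that follows it (what => next),
      # so an event ends at the first line that starts with "}".
      pattern => "^}"
      negate => true
      what => next
      # Emit whatever is still buffered after 1 second without new lines
      auto_flush_interval => 1
    }
  }
}
```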

decodeit · July 11, 2019, 2:15pm

So here is my config, but I still don't see any output. Do you see any mistakes in this?

`input {
stdin{    
	path => "/data/*.json"
	start_position => "beginning"
	sincedb_path => "/dev/null"

codec => multiline { pattern => "^Spalanzani" negate => true what => previous auto_flush_interval => 1 }
}

}

filter {
json {
source => "[message]"
remove_field => ["[message]"]
}
}

stdout {
codec => rubydebug
}
}
`

Badger · July 11, 2019, 2:21pm

Really? A stdin input?

decodeit · July 11, 2019, 2:23pm

Sorry, I don't understand. I was looking at this:

`input {

stdin {
codec => multiline {
pattern => "pattern, a regexp"
negate => "true" or "false"
what => "previous" or "next"
}
}
}`

Badger · July 11, 2019, 2:27pm

You are asking us to comment on your configuration, and the configuration you showed us was a stdin input with path, sincedb_path, and start_position options. That would result in compilation errors at startup.

decodeit · July 11, 2019, 2:30pm

Yes, I am getting an error. Logstash exited with code 0. Do I need to replace stdin with something else? I had this config when I was ingesting a CSV file.

Badger · July 11, 2019, 2:32pm

I would suggest using a file input if you want to read multiple files.
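
Putting the pieces of this thread together, a minimal sketch of a complete pipeline might look like the following. It assumes each file holds a single JSON document that should become one event; the path and the filter settings are taken from the configurations posted above, and the output block wrapper is the standard section that stdout needs to sit in.

```
input {
  file {
    # File input, not stdin: path, start_position and sincedb_path
    # are file input options
    path => "/data/*.json"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    codec => multiline {
      # A pattern that never matches, negated, glues every line to the
      # previous one, so the whole file becomes a single event
      pattern => "^Spalanzani"
      negate => true
      what => previous
      # Flush the buffered event after 1 second without new lines
      auto_flush_interval => 1
    }
  }
}

filter {
  json {
    # Parse the joined text as JSON, then drop the raw copy
    source => "[message]"
    remove_field => ["[message]"]
  }
}

output {
  stdout {
    codec => rubydebug
  }
}
```

The multiline codec also caps how many lines it will join into one event through its max_lines option (500 by default, as far as I recall), which is comfortably above the roughly 100 lines per file described in the question.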

system · August 8, 2019, 2:32pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
