Logstash not processing files in a folder simultaneously


(Hari Haran) #1

Hello All,

Logstash is not processing all the files in a folder in parallel. It processes them one after the other.

I am using one configuration file.

Example:

input {
  file {
    path => "/inputfiles/record*.txt"
    start_position => "beginning"
    close_older => 1
  }
}

The inputfiles folder has three files: record1, record2 and record3.

I used console output (sysout) to see how it picks up and indexes the files. It processes one file after the other.

For example, one file with 3 million records took 30 minutes. When I placed three files with 3 million records each in the folder, it took around 90 minutes. Of course, that is because it processes them one after the other.

Thanks


(Guy Boertje) #2

The docs say this:

In some cases it is useful to be able to control which files are read first, sorting, and whether files are read completely or banded/striped. Complete reading is all of file A then file B then file C and so on. Banded or striped reading is some of file A then file B then file C and so on looping around to file A again until all files are read. Banded reading is specified by changing file_chunk_count and perhaps file_chunk_size. Banding and sorting may be useful if you want some events from all files to appear in Kibana as early as possible.
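
As a rough sketch of what banded reading could look like, here is the input from the first post with the two settings added. The value 32 for file_chunk_count is only an illustration, not a tuned recommendation, and 32768 bytes (32KB) is the documented default for file_chunk_size, spelled out here for clarity.

```
input {
  file {
    path => "/inputfiles/record*.txt"
    start_position => "beginning"
    close_older => 1
    # Illustrative value: read 32 chunks from a file, then move on to the next file
    file_chunk_count => 32
    # Chunk size in bytes; 32768 (32KB) is the documented default
    file_chunk_size => 32768
  }
}
```

With these values each pass should read roughly 1MB (32 x 32KB) from record1, then record2, then record3, before looping back to record1. As far as I understand it, this interleaves the files rather than reading them in parallel, so events from all three files show up early but the total time for all files may not change much.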