Logstash 2.4 file input doesn't work

Hi,

I'm running Logstash 2.4 as a service on Ubuntu 14.04.

Here is my config:
input {
  file {
    path => "/mnt/resource/hadoop/yarn/log/*/*/stderr"
    start_position => "beginning"
    stat_interval => 300
    discover_interval => 0
    codec => multiline {
      pattern => "^%{TIMESTAMP_ISO8601} "
      negate => true
      what => "previous"
      auto_flush_interval => 10
      max_lines => 1000000
      max_bytes => "1 GiB"
    }
  }
}
filter {
  grok {
    match => ["path", "%{GREEDYDATA}/%{GREEDYDATA:filename}.log"]
  }
}
output {
  azureblob {
    storage_account_name => "oasparkintegratedata"
    storage_access_key => "j2ybUFO667hiHBB+lttAv18JWAluoCiuM2cvEWi5Sg7sDGNC+16+wRxRkXc1MmwxzG/x768RTUeffLqOm03aNw=="
    azure_container => "test"
  }
}

When the service starts, it shows the following information:
{:timestamp=>"2016-11-15T09:54:24.184000+0000", :message=>"Starting pipeline", :id=>"main", :pipeline_workers=>1, :batch_size=>125, :batch_delay=>5, :max_inflight=>125, :level=>:info}
{:timestamp=>"2016-11-15T09:54:24.185000+0000", :message=>"Pipeline main started"}

Then nothing new appears in either the logstash.err or the logstash.log file. I tried removing the .sincedb file, and the monitored folder keeps getting new lines written to it, but Logstash still finds no events. What's the problem here?
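
For reference, a sketch of the sincedb-removal steps; the path is an assumption, since with the Debian/Ubuntu package the sincedb files usually live in the logstash user's home directory (often /var/lib/logstash):

sudo service logstash stop
# assumed location; check the logstash user's $HOME for .sincedb_* files
sudo rm /var/lib/logstash/.sincedb_*
sudo service logstash start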

I seriously doubt it's a good idea to have discover_interval => 0.

I suggest you bump the log level with --verbose or even --debug to get additional clues about what the file input is doing.
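
For a service install the flag has to reach the init script. A minimal sketch, assuming the stock 2.x Debian package layout where the init script sources /etc/default/logstash:

# /etc/default/logstash (assumed location; verify against your init script)
LS_OPTS="--verbose"   # or "--debug" for maximum detail

# restart so the option takes effect
sudo service logstash restart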

Sorry for the late reply. I'm forced to set discover_interval => 0 because the time to discover new files also depends on stat_interval. If I set a large discover interval, new files won't be found for a long time.
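
If discovery really runs once every discover_interval stat cycles, as the behaviour described here suggests, one alternative sketch is to lower stat_interval instead of zeroing discover_interval; the values below are illustrative, not recommendations:

input {
  file {
    path => "/mnt/resource/hadoop/yarn/log/*/*/stderr"
    stat_interval => 15       # stat files every 15 s instead of every 300 s
    discover_interval => 4    # expand the glob every 4th stat cycle, i.e. roughly once a minute
  }
}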

I set the log level to --verbose, but the logs didn't show any useful error messages. With --debug the logs flood, and I haven't caught any useful clues so far.
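
One way to tame the --debug flood is to filter for the file input's own messages; the exact message wording differs between versions, so treat these patterns as a starting point:

# pull out only the filewatch/file-input lines from the debug output
grep -iE 'glob|discover|sincedb|filewatch' /var/log/logstash/logstash.log | tail -n 50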

Indeed, the file input can find some files under the given path, but it also misses some. Once, when I noticed some files weren't picked up by the input, I did nothing but run "service logstash reload", and it then worked for the missed files. After some time it was missing part of the files again. It's really strange and I have no idea how to resolve it.

Logstash logs which files it finds when it expands the filename patterns. Is that list correct? If you have high file churn, maybe your problem is inode number reuse.
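
One way to test the inode-reuse theory is to compare the inodes of the files on disk with the first column of the sincedb entries (each sincedb line is: inode, major device number, minor device number, byte offset). The sincedb path below is an assumption:

# inode numbers of the currently matched files
ls -li /mnt/resource/hadoop/yarn/log/*/*/stderr

# sincedb entries: inode, major dev, minor dev, byte offset
cat /var/lib/logstash/.sincedb_*

If a fresh file on disk reuses an inode that already sits in the sincedb with a large offset, the file input considers it already read and skips it.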

Yes, it's able to find some of the files under the filename patterns. I found that the .sincedb file had a record of a file's inode number, but that file never showed up in the output. Is this phenomenon related to inode number reuse? Why won't the file input continue to handle a file whose inode number is in the .sincedb file?

Yes, it's able to find some of the files under the filename patterns.

Some of the files? Or all files?

I found that the .sincedb file had a record of a file's inode number, but that file never showed up in the output. Is this phenomenon related to inode number reuse? Why won't the file input continue to handle a file whose inode number is in the .sincedb file?

According to the sincedb file, Logstash has already processed that file. Logstash's file input isn't capable of deleting old entries from the sincedb, but I believe Filebeat is.
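
For comparison, a hedged Filebeat 5.x sketch; clean_removed drops registry state for files that have disappeared from disk, which is what mitigates inode reuse (check the Filebeat docs for your version before relying on these options):

filebeat.prospectors:
  - input_type: log
    paths:
      - /mnt/resource/hadoop/yarn/log/*/*/stderr
    close_removed: true   # stop harvesting as soon as the file is deleted
    clean_removed: true   # drop the registry entry so a reused inode starts fresh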

It can only find some files.

I've found a new issue: after running for a while, the file input stops finding new files under the given pattern altogether.
