Logstash Process Hangs - No Logging


(Andreas H) #1

Hello everyone.
I've got the strangest problem.
I'm using multiple Logstash pipelines to monitor different folders for log file input. The idea is to have a support person drop a file on a website and have a dashboard generated for them after Logstash is done ingesting the data.
This is working great! Here is my sample input, which is pretty much the same for most of my files (there's one config with a multiline):

input {
    file {
        id => "performancelog_input_file"
        path => ["/var/log/performancelog/client_performance_*_*.log*"]
        type => "plain"
        mode => "read" #start_position => "beginning" close_older => true
        sincedb_path => "/dev/null"
        file_completed_action => "log_and_delete"
        file_completed_log_path => "/var/log/performancelog/processedFiles.log"
        file_sort_by => "last_modified"
        file_sort_direction => "asc"
    }
}

However, once every few days logstash seems to totally lock up and doesn't process any more files.
Logging stops (changing log levels to DEBUG doesn't show additional output while it's 'hanging').
Does anyone have ideas or tips on what log levels I can set to capture what's going on the next time it 'hangs'? Or suggestions for what I should set in my logstash.yml to have it 'wake up'?

I've thought about using Filebeat for the simple line-by-line logs (not the multiline one). Is it any more stable?

Thanks so much!


(Guy Boertje) #2

As the author of the file input changes, I am interested in getting to the bottom of why read mode looks broken after some time.

While developing the code, I did not run an instance for many days.

Please make sure you are using version 4.1.6 of the file input.

Filebeat does not yet have a read mode - we are waiting to iron out any kinks first in LS.

This may be hard to recreate.

At the time it "hangs", are the existing files fully read? I'm guessing you detect this situation when a file is dumped in the folder and it does not appear as a dashboard.

When this happens, you can turn on trace logging via the rest API.
First run curl -XGET 'localhost:9600/_node/logging?pretty'
You'll see something like this:

{
  "host" : "Elastics-MacBook-Pro.local",
  "version" : "6.4.0",
  "http_address" : "127.0.0.1:9600",
  "id" : "8789409b-7126-4034-9347-de47e6ce12a9",
  "name" : "Elastics-MacBook-Pro.local",
  "loggers" : {
    "filewatch.discoverer" : "DEBUG",
    "filewatch.observingtail" : "DEBUG",
    "filewatch.sincedbcollection" : "DEBUG",
    "filewatch.tailmode.handlers.createinitial" : "DEBUG",
    "filewatch.tailmode.processor" : "DEBUG",
    "logstash.agent" : "DEBUG",
    "logstash.api.service" : "DEBUG",
    "logstash.codecs.json" : "DEBUG",
    ...
    "logstash.filters.date" : "DEBUG",
    "logstash.inputs.file" : "DEBUG",
    ...
    "logstash.outputs.stdout" : "DEBUG",
    "logstash.pipeline" : "DEBUG",
    ...
    "slowlog.logstash.codecs.json" : "TRACE",
    "slowlog.logstash.codecs.rubydebug" : "TRACE",
    "slowlog.logstash.filters.date" : "TRACE",
    "slowlog.logstash.inputs.file" : "TRACE",
    "slowlog.logstash.outputs.stdout" : "TRACE"
  }
}

The ones you need to change are the ones prefixed with filewatch.

curl -XPUT 'localhost:9600/_node/logging?pretty' -H 'Content-Type: application/json' -d'
{
    "logger.filewatch.discoverer" : "TRACE"
}
'

Now you will get more information on what the discoverer is doing.
e.g.

[...][TRACE][filewatch.discoverer     ] discover_files handling: {"new discovery"=>false, "watched_file details"=>"<FileWatch::WatchedFile: @filename='testing-b.txt', @state='active', @recent_states='[:watched, :watched]', @bytes_read='11', @bytes_unread='0', current_size='11', last_stat_size='11', file_open?='true', @initial=false, @sincedb_key='10211069 1 4'>"}
[...][TRACE][filewatch.discoverer     ] discover_files handling: {"new discovery"=>false, "watched_file details"=>"<FileWatch::WatchedFile: @filename='testing-a.txt', @state='active', @recent_states='[:watched, :watched]', @bytes_read='22', @bytes_unread='0', current_size='22', last_stat_size='22', file_open?='true', @initial=false, @sincedb_key='10211020 1 4'>"}

(The above is tail mode.) It tells me that it "knows" about two files: they are not new, they are actively being monitored for new content, they are fully read, the file handle is open, etc.
A freshly dropped file should appear as "new discovery"=>true, and @state='active' tells me that the file is in the Active state.

You can set the discoverer logging back to INFO. Now move on to filewatch.readmode.processor and set its log level to TRACE.
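If you want to flip all three filewatch loggers mentioned above in one pass, a small shell loop over the same logging API works too. This is just a sketch: as written it only echoes the payloads, so uncomment the curl line to actually apply them to a running instance on localhost:9600.

```shell
# Sketch: build the logging-API payload for each filewatch logger
# discussed in this thread. Echo only; uncomment curl to apply.
for logger in filewatch.discoverer filewatch.readmode.processor filewatch.readmode.handlers.read_file; do
  payload="{ \"logger.${logger}\" : \"TRACE\" }"
  echo "$payload"
  # curl -s -XPUT 'localhost:9600/_node/logging' -H 'Content-Type: application/json' -d "$payload"
done
```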

You should see the processor moving through the Watched and Active files. Files that are "done" are neither Watched nor Active. A file is moved from the Watched state to the Active state when there is a slot in the max_open_files window - for a total file count of < 4095, a new file should go to Active pretty quickly, and then filewatch.readmode.handlers.read_file takes over.

Do you, by any chance, see 4095 files in your completed log when the "hang" occurs?
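A quick way to check is to compare the completed-log line count against that window. A sketch, assuming the completed-log path from the config above (4095 being the file input's default max_open_files):

```shell
# Sketch: compare the completed-files log line count against the
# default max_open_files window of 4095 (path from the config above).
LOG=/var/log/performancelog/processedFiles.log
count=$(wc -l 2>/dev/null < "$LOG" || echo 0)   # 0 if the log is missing
if [ "$count" -ge 4095 ]; then
  msg="completed log at $count lines: at or over the max_open_files window"
else
  msg="completed log at $count lines: under the 4095 window"
fi
echo "$msg"
```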


(Andreas H) #3

Thank you for the detailed response Guy! I will certainly try to turn on this logging and let you know what I find out.
To answer your final question, it is unlikely that the value would be over 4095, as I've only ever had approximately 100 files go through this particular pipeline since its creation. However, I will check and let you know.
Thank you again, I'll keep this post updated.


(Andreas H) #4

Hello Guy
Maybe I have the whole idea of the "read" setting wrong.
If a file is read to completion, normally its position would be recorded in the sincedb; however, I want each file to be read from the beginning each time it is uploaded (even if a file with the same name was read once before).
I.e.:
File is uploaded... logstash processes it... deletes it... the same file is uploaded again (and the whole thing is read again)... deleted... rinse and repeat.

This is the output I get from the filewatch.discoverer TRACE

[2018-10-05T10:11:47,870][TRACE][filewatch.discoverer     ] discover_files handling: {"new discovery"=>true, "watched_file details"=>"<FileWatch::WatchedFile: @filename='client_performance_CAN-TESTSITE-ON_RADAGDEMODS001.log', @state='watched', @recent_states='[:watched]', @bytes_read='0', @bytes_unread='0', current_size='3986521', last_stat_size='3986521', file_open?='false', @initial=true, @sincedb_key='67 0 2053'>"}
[2018-10-05T10:11:47,880][TRACE][filewatch.discoverer     ] discover_files handling: {"new discovery"=>true, "watched_file details"=>"<FileWatch::WatchedFile: @filename='client_performance_CAN-TESTSITE-ON_RADAGHCDS003.log', @state='watched', @recent_states='[:watched]', @bytes_read='0', @bytes_unread='0', current_size='1071210', last_stat_size='1071210', file_open?='false', @initial=true, @sincedb_key='61 0 2053'>"}
[2018-10-05T10:11:47,881][TRACE][filewatch.discoverer     ] discover_files handling: {"new discovery"=>true, "watched_file details"=>"<FileWatch::WatchedFile: @filename='client_performance_CAN-TESTSITE-ON_IMGTHMCPC843.log', @state='watched', @recent_states='[:watched]', @bytes_read='0', @bytes_unread='0', current_size='2003529', last_stat_size='2003529', file_open?='false', @initial=true, @sincedb_key='68 0 2053'>"}
[2018-10-05T10:11:47,882][TRACE][filewatch.discoverer     ] discover_files handling: {"new discovery"=>true, "watched_file details"=>"<FileWatch::WatchedFile: @filename='client_performance_CAN-TESTSITE-ON_RADAGMCDS238.log', @state='watched', @recent_states='[:watched]', @bytes_read='0', @bytes_unread='0', current_size='154945', last_stat_size='154945', file_open?='false', @initial=true, @sincedb_key='73 0 2053'>"}
[2018-10-05T10:11:47,883][TRACE][filewatch.discoverer     ] discover_files handling: {"new discovery"=>true, "watched_file details"=>"<FileWatch::WatchedFile: @filename='client_performance_CAN-TESTSITE-ON_RADAGTEST001.log', @state='watched', @recent_states='[:watched]', @bytes_read='0', @bytes_unread='0', current_size='102427', last_stat_size='102427', file_open?='false', @initial=true, @sincedb_key='98 0 2053'>"}
[2018-10-05T10:11:47,884][TRACE][filewatch.discoverer     ] discover_files handling: {"new discovery"=>true, "watched_file details"=>"<FileWatch::WatchedFile: @filename='client_performance_CAN-TESTSITE-ON_RADAGMCBIDS004.log', @state='watched', @recent_states='[:watched]', @bytes_read='0', @bytes_unread='0', current_size='573090', last_stat_size='573090', file_open?='false', @initial=true, @sincedb_key='59 0 2053'>"}

So the files are in a "WATCHED" state.

When I turn on the readmode.processor TRACE, I see that processing is constantly flipping back and forth between Active and Watched:

[2018-10-05T10:15:33,674][TRACE][filewatch.readmode.processor] Active processing
[2018-10-05T10:15:34,098][TRACE][filewatch.readmode.processor] Watched processing
[2018-10-05T10:15:34,098][TRACE][filewatch.readmode.processor] Active processing
[2018-10-05T10:15:34,674][TRACE][filewatch.readmode.processor] Watched processing
[2018-10-05T10:15:34,674][TRACE][filewatch.readmode.processor] Active processing

My processedFiles (completed) log is only showing 50 files processed so far:

root@radagc1syndb001:/var/log/impaxlog# cat processedFiles.log | wc -l
50

Oh, and setting this logger to TRACE does not produce any output in my log:

curl -XPUT 'localhost:9600/_node/logging?pretty' -H 'Content-Type: application/json' -d'
{
"logger.filewatch.readmode.handlers.read_file" : "TRACE"
}
'

Is this the correct behaviour? Am I using this plugin incorrectly?


(system) #5

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.