File input, from NetApp CIFS share, not reading single file

Hi,
I've made quite a few other posts, and have successfully ingested more than 1 billion messages in the last couple of weeks, but we've come to the conclusion that Logstash just isn't keeping up with the backlog of messages. The reason I say this is that Logstash will seemingly lose its place in the ingestion and start over at random.

What we have is NetApp CIFS auditing turned on, writing to a CIFS share on the filer itself (not included in the audit logging). There are 10,001 files in this directory: a single current file and 10,000 previous files, each 100MB in size, or approximately 77,000 lines. I had the input configured to /mnt/cifs/*.xml, and this was working for the most part, as I said above - hundreds of millions of messages have been ingested so far. But since it wasn't catching up, I opted to change the input to just the single current file. I've tried different combinations of start_position => "beginning", sincedb_path => "/dev/null", stat_interval => 1, mode => read, and others, and it seems that Logstash reads the file once when it starts and then never again, no matter what, until I restart Logstash.
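For context, the input block I've been experimenting with looks roughly like this (the path is illustrative, and I've been swapping the options in and out between runs):

```
input {
  file {
    path => "/mnt/cifs/audit_last.xml"   # the single current file; path is illustrative
    mode => "tail"                       # have also tried mode => "read"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    stat_interval => 1
  }
}
```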

I've read this page, but I'm not seeing where a specific problem is. Can someone break this down for me? Thank you in advance.

@badger since you've been so helpful so far :slight_smile:

sincedb_path => "/dev/null" does not prevent the file input from managing the sincedb; it just prevents it from persisting that db across restarts.

Why do you expect it to get read more than once?

If you enable '--log.level trace', what does filewatch have to say?

Okay, good to know, thank you.

It's not documented anywhere that I've found, but as I understand it, when the file reaches 100MB it gets renamed with a timestamp in the name, and a new latest file is started. Additionally, once Logstash has read that latest file at startup, it never reads any lines that are subsequently added to it, even though it's still the same file.

I'm sorry, where do I put this when I'm running Logstash as a service?

You can set the log.level in logstash.yml
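For example, a line like this in logstash.yml (typically /etc/logstash/logstash.yml on a package install, though your path may differ):

```
# logstash.yml
log.level: trace
```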

If I run the single config via the command line, it seems to read the file as I expect it to; however, if I put that config parameter in the yml and restart the service, it will read everything in the current file once, then stop reading any new messages. In both cases, I get the following:

[DEBUG] 2019-08-16 13:54:58.339 [pool-3-thread-2] jvm - collector name {:name=>"ParNew"}
[DEBUG] 2019-08-16 13:54:58.339 [pool-3-thread-2] jvm - collector name {:name=>"ConcurrentMarkSweep"}
[TRACE] 2019-08-16 13:54:58.595 [[main]<file] processor - Delayed Delete processing
[TRACE] 2019-08-16 13:54:58.595 [[main]<file] processor - Watched + Active restat processing
[TRACE] 2019-08-16 13:54:58.598 [[main]<file] processor - Rotation In Progress processing
[TRACE] 2019-08-16 13:54:58.598 [[main]<file] processor - Watched processing
[TRACE] 2019-08-16 13:54:58.600 [[main]<file] processor - Active - no change {"watched_file"=>"<FileWatch::WatchedFile: @filename='audit_last.xml', @state='active', @recent_states='[:watched, :watched]', @bytes_read='33244652', @bytes_unread='0', current_size='33244652', last_stat_size='33244652', file_open?='true', @initial=false, @sincedb_key='294976 0 42'>"}

and it looks like the current_size and last_stat_size don't change.

Does it work as expected with a local file?

Even if it did, NetApp CIFS auditing doesn't have an option to write to anything that wouldn't be a remote share from Logstash's point of view. I could try to figure out something to sync the files from the CIFS share to a local filesystem, maybe, if reading the share directly turns out not to work.

FYI, I tried using lsyncd to keep the single file in sync with the local system, but it didn't seem to be able to do that repeatedly. I'm going to try again with the rsync.ssh config, but I'm not optimistic. Also, when I checked this morning, the running config has been pulling in messages, but only about once an hour.

I found this morning that if I run watch ls /mnt/cifs_audit/, the config runs fine and ingests as I'd expect it to. Until I can figure out a better solution, I'm running a cron job to ls the directory on a loop (sketched below), and it seems to be importing.
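The cron entry is roughly along these lines (the one-minute interval is arbitrary; the point is just to touch the directory often enough that the CIFS client re-stats it):

```
# force the CIFS client to re-read the audit directory every minute
* * * * * ls /mnt/cifs_audit/ > /dev/null 2>&1
```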

It seems adding the close_older option (we set it to 5) was what we were looking for, and it doesn't require the cron job or lsyncd.
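For anyone who lands here later, the working input ended up looking roughly like this (path illustrative, and the 5-second value is just what worked for us):

```
input {
  file {
    path => "/mnt/cifs_audit/audit_last.xml"  # illustrative path
    start_position => "beginning"
    stat_interval => 1
    close_older => 5   # close the file after 5s of inactivity; reopening it seems to force a fresh stat over CIFS
  }
}
```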

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.