Logstash not ingesting when new file added to the directory

I have Logstash configured to ingest S3 access logs. I can successfully ingest data into Elasticsearch when I restart Logstash, without any issues. But when new files get added to the directory Logstash pulls data from, Logstash does nothing. I am not sure what's going wrong; the permissions and everything else are correct. Below is my Logstash configuration.

input {
  file {
    type => "s3-access-log-new"
    path => "/home/ubuntu/s3logs/logs/*"
    start_position => "beginning"
  }
}
filter {
  if [type] == "s3-access-log-new" {
    grok {
      match => { "message" => "%{S3_ACCESS_LOG}" }
    }
    date {
      locale => "en"
      match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    }
  }
}
output {
  elasticsearch {
    host => ["elk-prod-02-data01"]
    port => "9200"
    protocol => "http"
    index => "niraj-log-s3-new-%{+YYYY.MM.dd}"
  }
  stdout { codec => rubydebug }
}

What am I doing wrong?

Niraj

Is there a reason you aren't using the s3 input plugin?

Is there a reason I should use it in place of file-based ingestion? I am not a Logstash expert and I haven't tried it. Will using the s3 plugin fix my problem?

Not necessarily. Just curious how you are syncing or pulling in new files to this directory.

What version of logstash are you using?

So it goes like this.

The s3cmd tool syncs the files via a cron job. The sync runs continuously on a 30-minute interval.
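Roughly this crontab entry (reconstructed from memory; the exact flags may differ):

*/30 * * * * s3cmd sync s3://niraj-s3-log/logs/ /home/ubuntu/s3logs/logs/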

LS version: 1.5.3
ES version: 1.5.1

You should upgrade to at least LS 1.5.6, the last release of the 1.5 series. It uses the ruby filewatch 0.6.7 library for the file input; filewatch 0.6.4 had a bug that lost file-tracking information.
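If the sincedb has already lost track of your files, you may also need to clear it so the input starts fresh after the upgrade. By default the file input writes sincedb files into the home directory of the user running Logstash; the exact filename varies, so treat this as a sketch:

# stop Logstash first, then inspect and remove the stale sincedb entries
ls -la ~/.sincedb*
rm ~/.sincedb*
# restart Logstash; with start_position => "beginning" the files will be re-read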

And I just looked: the 1.0.0 logstash-input-s3 plugin does not use the filewatch library; rather, it downloads and processes the objects directly (and has an option to save a copy of each file). This should work around the issue you are seeing and provide the same level of functionality.
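A minimal sketch of what that input could look like; the bucket, prefix, and region are placeholders to swap for your own values:

input {
  s3 {
    type => "s3-access-log"
    bucket => "niraj-s3-log"      # your bucket
    prefix => "logs"              # your key prefix
    region => "us-east-1"
    access_key_id => "..."
    secret_access_key => "..."
    interval => 60                # poll every 60 seconds
  }
}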

Let me try that and see if it works. I believe the LS version is not tied to a specific ES version the way some plugins are.

I just installed this version, but the behavior still seems to be the same.

What does tree /home/ubuntu/s3logs/logs/ or ls -la /home/ubuntu/s3logs/logs/ look like?

Something like below:

-rwxrwxr-x 1 ubuntu ubuntu 381 Jul 25 19:43 2016-07-25-19-43-56-C970EE64265A8BC4
-rwxrwxr-x 1 ubuntu ubuntu 382 Jul 25 19:44 2016-07-25-19-44-18-651834568545399D
-rwxrwxr-x 1 ubuntu ubuntu 387 Jul 25 19:46 2016-07-25-19-46-19-CB7B8C1AD2B33C72
-rwxrwxr-x 1 ubuntu ubuntu 306 Jul 25 19:46 2016-07-25-19-46-21-1B1792E2EBD80D00
-rwxrwxr-x 1 ubuntu ubuntu 612 Jul 25 19:47 2016-07-25-19-47-30-68807DA9BA824DD8
-rwxrwxr-x 1 ubuntu ubuntu 383 Jul 25 19:48 2016-07-25-19-48-12-5A8DADC21D9EF463
-rwxrwxr-x 1 ubuntu ubuntu 306 Jul 25 19:48 2016-07-25-19-48-17-BF00BFFFB90B48F1
-rwxrwxr-x 1 ubuntu ubuntu 880 Jul 25 19:48 2016-07-25-19-48-58-FB66AE7E7FA4D696
-rwxrwxr-x 1 ubuntu ubuntu 306 Jul 25 19:50 2016-07-25-19-50-39-B25B166C8EA87D8D
-rwxrwxr-x 1 ubuntu ubuntu 382 Jul 25 19:52 2016-07-25-19-52-42-B38935F89AF1A485
-rwxrwxr-x 1 ubuntu ubuntu 1881 Jul 25 19:54 2016-07-25-19-54-06-3D24B32CB2E53D46

@jpcarey, is there a way to find out whether the file input plugin I currently have uses the ruby filewatch library? I did update Logstash, but I believe the re-install didn't update the plugin.

bin/plugin list --verbose | grep file

Should be logstash-input-file (1.0.2).
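If you see an older version, updating it in place should work; something along these lines, assuming the standard plugin tooling in LS 1.5:

bin/plugin update logstash-input-file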

Thanks @jpcarey. I am really scratching my head now :smile: The input plugin is fine as well. I am really not sure what is going wrong.

I would assume it has something to do with how s3cmd downloads and creates the files. I would guess that it creates temporary files during download. This would require some in-depth troubleshooting to figure out exactly what is happening, and why Logstash does not pick the files up (or potentially believes it has already processed them).

You might try appending an extension to each finished file (if you can do this with s3cmd, or control the S3 access log naming pattern). Then configure Logstash to look only for *.my_file_type; it should then ignore any temporary files in the directory. A sketch of the adjusted input is below.
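Something like this, where .s3log is a made-up extension standing in for whatever suffix you choose:

input {
  file {
    type => "s3-access-log-new"
    path => "/home/ubuntu/s3logs/logs/*.s3log"   # only finished files carry the extension
    start_position => "beginning"
  }
}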

Alternatively, try the s3 input.

I tested this out today on a different box with the same setup as my EC2 instance. On my local workstation it works perfectly fine and detects new files when s3cmd adds them to the directory.

I will try the s3 input and see if that takes care of the issue.

Any idea why I see the below error with the s3 plugin?

A plugin had an unrecoverable error. Will restart this plugin.
Plugin: <LogStash::Inputs::S3 type=>"s3-access-log", bucket=>"niraj-s3-log", prefix=>"logs", region=>"us-east-1", access_key_id=>"xxxxxxxxxxxxxxxxxxxxxxxxx", secret_access_key=>"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx", debug=>false, codec=><LogStash::Codecs::Plain charset=>"UTF-8">, use_ssl=>true, delete=>false, interval=>60, temporary_directory=>"/tmp/logstash">
Error: certificate verify failed {:level=>:error}
A plugin had an unrecoverable error. Will restart this plugin.
Plugin: <LogStash::Inputs::S3 type=>"s3-access-log", bucket=>"niraj-s3-log", prefix=>"logs", region=>"us-east-1", access_key_id=>"xxxxxxxxxxxxxxxxxxxxxxxxx", secret_access_key=>"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx", debug=>false, codec=><LogStash::Codecs::Plain charset=>"UTF-8">, use_ssl=>true, delete=>false, interval=>60, temporary_directory=>"/tmp/logstash">
Error: certificate verify failed {:level=>:error}
^CSIGINT received. Shutting down the pipeline. {:level=>:warn}
Logstash shutdown completed