S3 Input Plugin Does Not Delete The Temporary Files

The S3 input plugin does not delete the temporary files it creates when downloading objects from the S3 bucket, even after they have been processed and indexed into Elasticsearch. Is there a workaround or setting to automate this? Our Logstash instance is running out of disk space, and we do not want to keep that data on the instance since it is already indexed in Elasticsearch.

The input unconditionally calls FileUtils.remove_entry_secure to delete the temporary file.

The Ruby documentation suggests (not entirely clearly) that there are circumstances when this will not delete the file. Does the user running logstash own the temporary directory and all of the files in it? (Granting write access via group or world permissions will not work.)
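One way to answer the ownership question is a short Ruby script, run as the same user that runs logstash. This is only a sketch: the helper name entries_not_owned_by is made up for illustration and is not part of the plugin or of Ruby. It walks a directory and returns every entry, including the directory itself, whose owner differs from the effective uid of the current process.

```ruby
require "find"

# Hypothetical helper (not part of logstash or the s3 input):
# returns every path under `dir`, including `dir` itself, whose owner
# is not `uid`. By default `uid` is the effective uid of this process.
def entries_not_owned_by(dir, uid = Process.euid)
  Find.find(dir).select do |path|
    begin
      File.lstat(path).uid != uid
    rescue Errno::ENOENT
      false # the entry disappeared while we were scanning; ignore it
    end
  end
end

# Example call; substitute your own temporary directory:
#   puts entries_not_owned_by("/path/to/temporary_directory")
```

An empty result means the user owns the directory and everything in it, so ownership is probably not the cause. It does not check the parent directories or their permission bits.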

Thanks, @Badger.

The temporary files are written to the /opt/elasticsearch/tmp/logstash directory. The directories /opt, /opt/elasticsearch and /opt/elasticsearch/tmp are all owned by root.

Only the directory /opt/elasticsearch/tmp/logstash and the temporary files within it are owned by the user logstash (which is also the user running the LS process).

```
total 4308
drwxr-xr-x  2 logstash logstash    4096 Oct 19 13:26 jruby-10885
drwxr-xr-x 10 logstash logstash 4399104 Oct 27 19:36 logstash
```

If you have experience reading traces from truss/strace/dtrace then I would suggest enabling debug logging in logstash so that you get a timestamp from

@logger.debug("Downloading remote file", :remote_key => remote_object.key, :local_filename => local_filename)

and then trace the logstash process as it downloads a small file from a test bucket in s3.

In that trace, the timestamp of the debug message will show you where to start, and the sincedb write

::File.open(@sincedb_path, 'w') { |file| file.write(since.to_s) }

should produce a trace message that you know comes after the attempt to delete the temporary file. The input runs in its own thread, so that allows further filtering of the trace. Then it would be a question of reconciling the trace with the code for remove_entry_secure to see if you can figure out what path it takes through the function and why it is not deleting the file.

It will not be trivial to follow the code alongside the filtered trace, but that is what I would try.
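Before tracing the live process, it may be worth checking that FileUtils.remove_entry_secure works at all in that directory for the logstash user. The sketch below is an assumption-laden test, not the plugin's code: the helper name remove_entry_secure_works? and the scratch file name are invented for illustration. It creates a scratch directory under a parent you choose, writes one file, removes it with remove_entry_secure, and reports whether the file is gone. The force argument is left at its default of false so that any failure raises instead of being silently ignored.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper: creates a scratch file under `parent`, removes it
# with FileUtils.remove_entry_secure, and returns true if the file no
# longer exists afterwards. The scratch directory is always cleaned up.
def remove_entry_secure_works?(parent = Dir.tmpdir)
  dir = Dir.mktmpdir("s3-cleanup-test", parent)
  path = File.join(dir, "object.tmp")
  File.write(path, "placeholder")
  # force is left at its default (false) so that errors are raised
  FileUtils.remove_entry_secure(path)
  !File.exist?(path)
ensure
  FileUtils.rm_rf(dir) if dir
end

# Example call; substitute the directory logstash uses for temporary files:
#   puts remove_entry_secure_works?("/path/to/temporary_directory")
```

If this returns true when run as the logstash user (ideally under the same JRuby that logstash uses), plain deletion works there and the trace is the next step. If it raises, the exception message should point at the permission or path problem.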
