Tell Logstash to ignore old files in an S3 bucket

I'm using the Logstash s3 plugin to ingest logs from a partner's S3 bucket.

Logstash itself runs in a container, so its sincedb is not persistent. When the container restarts, Logstash starts over and begins ingesting logs from that bucket since the beginning of time.

Neither delete nor backup_to_bucket is really an option, because it's not my bucket to manage.

Is there a way I can tell Logstash to ignore anything older than ~1 day?

Can I seed sincedb with a date? For example, before Logstash starts, can I echo a date in the format 2020-01-31 07:01:33 +0000 (where that date is roughly one day ago) into a file under:

/var/lib/logstash/plugins/inputs/s3

What is the syntax for the hash in the name of the sincedb file, e.g. sincedb_ca090558edfcc5759ac626c813a5a2c2? Or can I just name it whatever I want and point sincedb_path at it?
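A minimal sketch of the seeding idea. Assumptions: the s3 input accepts a plain timestamp as the sincedb contents, and sincedb_path in the input config points at this file; the file name sincedb_s3_input is made up for the example, and /tmp is used here only so the sketch runs unprivileged (on a real host this would be the /var/lib/logstash/plugins/inputs/s3 directory from the question).

```shell
# Seed a sincedb file with a timestamp ~1 day in the past before Logstash
# starts, so the s3 input skips anything older. File name, location, and
# the accepted timestamp format are assumptions, not documented API.
DIR=/tmp/logstash-sincedb-demo
mkdir -p "$DIR"
# Emit a "2020-01-31 07:01:33 +0000"-style UTC timestamp, one day ago (GNU date)
date -u -d '1 day ago' '+%F %T +0000' > "$DIR/sincedb_s3_input"
cat "$DIR/sincedb_s3_input"
```

In the s3 input you would then set sincedb_path to that file so Logstash reads the seeded timestamp instead of generating a hashed file name.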

Any other suggestions? Thanks in advance. -Clark

This thread is about dropping events using the output of an age{} filter. Logstash would still read the whole bucket, but it wouldn't reprocess all the old data.
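That approach looks roughly like the following pipeline fragment (a sketch: it assumes the logstash-filter-age plugin is installed, and the 86400-second cutoff is just the ~1 day from the question):

```
filter {
  # age{} computes the event's age (now minus @timestamp) in seconds
  # and stores it in [@metadata][age] by default
  age {}
  # Drop anything older than ~1 day (86400 seconds)
  if [@metadata][age] > 86400 {
    drop {}
  }
}
```

Note the files are still listed and downloaded from the bucket on each restart; only the downstream processing and output of old events is skipped.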

That's great, thank you! I've also since implemented that sincedb workaround, and startup times are much better now. Appreciate the follow-up!

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.