Logstash repeatedly inserts the same data from a corrupted S3 bucket file

Hi,

I am using Logstash to read files from an S3 bucket and insert them into Elasticsearch.
I am facing two issues right now.

  1. If the S3 file is empty, Logstash does not delete it and move on; instead it raises an error.

  2. If the file is corrupt (only half readable), Logstash inserts the same file's data again and again. It does not delete the S3 file, so the file is read repeatedly.

Is there any solution to this? I am stuck.

Regarding (1), what error are you getting? I can't see anything in the LS code that detects an empty file and raises an error.

Regarding (2), if you can't delete the bad S3 file in the bucket, you will have to edit the sincedb file. It stores the string representation of the last-modified time of the last file completely read.

You will need to find the last-modified time of the corrupt file in AWS and set the contents of the sincedb file to that timestamp, so Logstash treats the corrupt file as already processed and skips it on the next run.

Unless you specified the sincedb path in the config, you can find the sincedb file at $HOME/.sincedb_[some hex characters].
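If you want to script that edit, here is a minimal Python sketch. It assumes boto3 is installed with AWS credentials configured; the bucket name, object key, and the default sincedb location are placeholders you would swap for your own values. It looks up the corrupt object's last-modified time and writes it into the sincedb file as a plain timestamp string. Stop Logstash before running it so the plugin doesn't overwrite the file, and restart it afterwards.

```python
# Sketch: point the s3 input's sincedb at the corrupt file's last-modified
# time so Logstash treats it as already processed and skips it.
# Assumptions: boto3 is installed, AWS credentials are configured, and the
# bucket name / object key below are placeholders for your own values.
import glob
import os

import boto3

BUCKET = "my-log-bucket"           # placeholder: your bucket name
CORRUPT_KEY = "logs/bad-file.log"  # placeholder: key of the corrupt object

# Locate the sincedb file (default location unless sincedb_path was set).
candidates = glob.glob(os.path.join(os.path.expanduser("~"), ".sincedb_*"))
if len(candidates) != 1:
    raise SystemExit(f"Expected exactly one sincedb file, found: {candidates}")
sincedb_path = candidates[0]

# Fetch the corrupt object's last-modified time from S3.
s3 = boto3.client("s3")
last_modified = s3.head_object(Bucket=BUCKET, Key=CORRUPT_KEY)["LastModified"]

# Write it as the "last file completely read" timestamp. The plugin reads
# this back as a parseable time string, e.g. "2017-05-03 10:15:30 +0000".
with open(sincedb_path, "w") as f:
    f.write(last_modified.strftime("%Y-%m-%d %H:%M:%S %z") + "\n")

print(f"Wrote {last_modified} to {sincedb_path}")
```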

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.