A client is providing AWS cloud trail logs via an S3 bucket. The S3 input plugin will read .gz files.
However the problem is that the file, eg "cloudtrail_log_2019-01-22.gz" is actually gzip'd twice, and should really be named cloudtrail_log_2019-01-22.gz.gz"
I have tried the gzip_lines codec but that assumes each line in a file is gzip'd not the whole file.
Has anyone else seen this problem and found a solution? Right now I'm thinking about a pipeline using s3 input which just unzips and outputs to a file on the local drive, but suffixing with a .gz on the inner gzip file. Then a second pipeline which reads from that using file codec on the new .gz file.
Thoughts? Comments?