Hi,
I posted an issue on GitHub but I thought I might as well ask for some help here:
- Description:
When Logstash is shut down abnormally, the S3 files created in the temporary directory are kept on disk and automatically uploaded on the next startup (when the `restore` option is set to `true`).
This is usually fine, but when the `encoding => "gzip"` option is set, the saved gzip file may be corrupted and is uploaded as-is to S3:
$ zcat ls.s3.18e199e7-6bf9-4a82-ad65-5fbc3d34ccce.2018-08-02T14.06.part3.txt.gz >/dev/null
zcat: ls.s3.18e199e7-6bf9-4a82-ad65-5fbc3d34ccce.2018-08-02T14.06.part3.txt.gz: unexpected end of file
zcat: ls.s3.18e199e7-6bf9-4a82-ad65-5fbc3d34ccce.2018-08-02T14.06.part3.txt.gz: uncompress failed
The files then cannot be retrieved via the S3 input plugin; the following error is thrown (the read-back input configuration is sketched at the end of this post):
Error: Unexpected end of ZLIB input stream
All records from that file/batch are then discarded.
This is problematic since it means we cannot rely on persistent queues to ensure no data is lost.
- Version: 4.1.4
- Operating System: Docker (docker.elastic.co/logstash/logstash:6.3.2)
- Options: `encoding => "gzip"`, `restore => "true"`
- Steps to Reproduce:
- Launch a Logstash instance with an input plugin that receives a steady flow of events.
- Configure the S3 output plugin with the options `encoding => "gzip"` and `restore => "true"` (an example pipeline is sketched after these steps).
- When a gzip file appears in `/tmp/logstash`, kill the Logstash instance abruptly, e.g. with `docker exec -it logstash kill -KILL 1` if you are under Docker.
- Inspect the temporary file in `/tmp/logstash` that will be sent to S3 on the next startup. It will most likely be corrupted:
$ zcat ls.s3.18e199e7-6bf9-4a82-ad65-5fbc3d34ccce.2018-08-02T14.06.part3.txt.gz >/dev/null
zcat: ls.s3.18e199e7-6bf9-4a82-ad65-5fbc3d34ccce.2018-08-02T14.06.part3.txt.gz: unexpected end of file
zcat: ls.s3.18e199e7-6bf9-4a82-ad65-5fbc3d34ccce.2018-08-02T14.06.part3.txt.gz: uncompress failed
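For reference, the reproduction pipeline can be as minimal as the sketch below. The `generator` input and the bucket/region values are only placeholders for illustration; any input producing a steady flow of events will do.

```
input {
  # Any input with a steady flow of events works; generator is just convenient.
  generator {
    lines => ["test event"]
    count => 0   # 0 = keep emitting events indefinitely
  }
}

output {
  s3 {
    bucket   => "my-bucket"   # placeholder
    region   => "eu-west-1"   # placeholder
    encoding => "gzip"
    restore  => true
    # Part files are written under the default temporary_directory, /tmp/logstash.
  }
}
```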
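And the corrupted files were read back with an S3 input along these lines (bucket/region again placeholders); this is where the `Unexpected end of ZLIB input stream` error shows up:

```
input {
  s3 {
    bucket => "my-bucket"   # placeholder, same bucket the output wrote to
    region => "eu-west-1"   # placeholder
  }
}
```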