Hello All,
I have set up a Kinesis Firehose stream which accepts log data from a Kinesis data stream and stores these logs as a backup in an S3 bucket in gz format.
For this configuration I have enabled KMS encryption, with the key set to "aws/s3", which is the default.
Now I am trying to parse the KMS-encrypted logs that Firehose stored on S3 in gz format, but I am not getting the data back in decrypted form. I am using the following configuration to fetch the data from the S3 bucket.
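Roughly, the input looks like this (a minimal sketch; the bucket name, region, and prefix below are placeholders, not my actual values):

    input {
      s3 {
        bucket => "my-firehose-backup-bucket"   # placeholder bucket name
        region => "us-east-1"                   # placeholder region
        prefix => "logs/"                       # placeholder key prefix written by Firehose
      }
    }
    output {
      stdout { codec => rubydebug }             # print each event so I can inspect it
    }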
The s3 input acquires and decrypts the files, handing chunks of gzipped bytes off to the codec, which creates events. With the gzip_lines codec, we should be able to decompress the gzip-encoded data into plain text.
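In the s3 input that would look something like the following sketch (the bucket name and region are placeholders):

    input {
      s3 {
        bucket => "my-firehose-backup-bucket"   # placeholder
        region => "us-east-1"                   # placeholder
        codec  => gzip_lines                    # decompress each gzipped chunk into lines
      }
    }

gzip_lines ships as a community plugin, so it may need to be installed first with bin/logstash-plugin install logstash-codec-gzip_lines.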
Thanks for the reply. I tried setting the codec to gzip_lines, but I am getting the following exception:
For the input data, we have wired a CloudWatch Logs log group to the Kinesis stream, and that Kinesis stream is attached to the Firehose stream, which stores the data on S3.
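For reference, the log group is subscribed to the stream roughly like this (a sketch only; every name and ARN below is a placeholder, not our actual setup):

    # Subscribe a CloudWatch Logs log group to a Kinesis data stream.
    aws logs put-subscription-filter \
      --log-group-name "/my/app/loggroup" \
      --filter-name "to-kinesis" \
      --filter-pattern "" \
      --destination-arn "arn:aws:kinesis:us-east-1:123456789012:stream/my-log-stream" \
      --role-arn "arn:aws:iam::123456789012:role/CWLtoKinesisRole"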
I was wrong. The gzip_lines codec is a community-provided codec that requires its input to be something that responds to read (like an IO object or an open File), but by default most inputs pass a String to the codec. In its current state, it doesn't appear to work with most inputs.
The s3 input is documented to automatically decompress .gz inputs, so it should be able to decompress the files without a special codec. Can you run the original configuration again with debug-level logging enabled? This can be done either with the log.level: debug setting in your logstash.yml configuration, or with the --log.level debug command-line flag.
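Concretely, either of these should work (the pipeline filename is a placeholder):

    # Option 1: add to logstash.yml
    log.level: debug

    # Option 2: pass the flag on the command line
    bin/logstash -f pipeline.conf --log.level debug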