Read *.json.gz from AWS S3 bucket

Hi All,

I am new to the ELK Stack and am trying to read data from S3 buckets. The JSON data is gzip-compressed, and the folder structure in the S3 bucket is like



Each folder can have multiple files.

I'm looking for a code snippet to read these files using Logstash.

Read the file input plugin docs: File input plugin | Logstash Reference [7.16] | Elastic
Read mode supports gzip file processing, but I believe you then have to define a gzip codec in your input.
However, try it without the codec first and see if read mode works on its own. I haven't tried that before.

input {
  file {
    path => ["/var/log/202*/*.json.gz"]
    codec => "gzip_lines"
    mode => "read"
  }
}

From the S3 input's docs:

Each line from each file generates an event. Files ending in .gz are handled as gzip’ed files.

Since the S3 input is line-oriented, if the contents of your GZIP files are not line-oriented (such as each file containing a single JSON object), you may need to use the multiline codec to buffer all of the lines into a single event, and then a json filter to parse the contents into a structured object:

input {
  s3 {
    bucket => ""
    access_key_id => "1234"
    secret_access_key => "secret"
    codec => multiline {
      pattern => "." # anything
      what => "previous" # accumulate until EOF
    }
  }
}
filter {
  json {
    source => "message"
  }
}
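
Putting the pieces together, a complete pipeline might look like the sketch below. The bucket name, prefix, region, credentials, and index name are placeholders I've assumed, not values from this thread; `auto_flush_interval` is added so the multiline codec emits the buffered event after the file has been read:

```conf
input {
  s3 {
    bucket            => "my-bucket"     # hypothetical bucket name
    prefix            => "2022/"         # hypothetical folder prefix
    region            => "us-east-1"     # assumption: adjust to your region
    access_key_id     => "REDACTED"
    secret_access_key => "REDACTED"
    codec => multiline {
      pattern => "."                # match anything
      what    => "previous"         # accumulate lines into one event
      auto_flush_interval => 5      # flush the buffered event after 5s of inactivity
    }
  }
}

filter {
  json {
    source => "message"   # parse the accumulated JSON blob into fields
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]  # assumption: local Elasticsearch
    index => "s3-json"                  # hypothetical index name
  }
}
```

Note that the S3 input decompresses files ending in .gz automatically, so no gzip codec is needed here.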
