Filebeat reading logs from S3

Hi,

I'm trying to ingest AWS logs that are stored in a centralised S3 bucket. I configured an SQS queue to receive the S3 notifications so Filebeat can fetch the files and push them to an Elastic Cloud index.

I'm facing the below problems:

  1. When I look at the indexed logs, each line from the log file is stored as a separate document.
  2. I get a gzip invalid header error while ingesting the WAF and CloudTrail logs.

ERROR:
2020-04-01T19:16:31.002+0530  WARN   [s3]  s3/input.go:277  Processing message failed, updating visibility timeout
2020-04-01T19:16:31.011+0530  INFO   [s3]  s3/input.go:282  Message visibility timeout updated to 300
2020-04-01T19:16:31.035+0530  INFO   [s3]  s3/input.go:282  Message visibility timeout updated to 300
2020-04-01T19:16:31.035+0530  ERROR  [s3]  s3/input.go:447  gzip.NewReader failed: gzip: invalid header
2020-04-01T19:16:31.035+0530  ERROR  [s3]  s3/input.go:386  createEventsFromS3Info failed for folder/XXXXXXXXXX/waf_logs/date/filename.gz: gzip.NewReader failed: gzip: invalid header

Hey!

Could you share your configuration please?

Also, please have a look at the docs and make sure you haven't missed anything like the required permissions.
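
For reference, the s3 input needs read access to both the SQS queue and the bucket objects. Roughly something like the policy below (a sketch only; the queue ARN matches your redacted queue, and your-bucket is a placeholder for your own bucket name):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "sqs:ReceiveMessage",
        "sqs:ChangeMessageVisibility",
        "sqs:DeleteMessage"
      ],
      "Resource": "arn:aws:sqs:us-west-2:XXXXXXXXXX:sqs-name"
    },
    {
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::your-bucket/*"
    }
  ]
}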

Thanks!

Hi @ChrsMark,

Thanks for your response!

This is my config:

filebeat.inputs:
 - type: s3
   queue_url: https://sqs.us-west-2.amazonaws.com/XXXXXXXXXX/sqs-name
   visibility_timeout: 300s
   credential_profile_name: default

cloud.id: "cloudid"
cloud.auth: "elastic:{password}"

And yes, my AWS profile has admin access.

Thanks!

Thanks!

Could you also share the complete log output of Filebeat? Please run it in debug mode with ./filebeat -e -d "*".

C.

@ChrsMark

Can we set up a call to discuss this?

Thanks!

@ChrsMark Or could you give me a sample config file to get logs from an S3 bucket that contains CloudTrail, CloudFront, VPC Flow Logs, CloudWatch, and WAF logs?

@ChrsMark Is there any other module available to collect logs from an S3 bucket?

Thanks!

@Nithya Thanks for creating this issue here.

filebeat.inputs:
 - type: s3
   queue_url: https://sqs.us-west-2.amazonaws.com/XXXXXXXXXX/sqs-name
   visibility_timeout: 300s
   credential_profile_name: default
   expand_event_list_from_field: Records

cloud.id: "cloudid"
cloud.auth: "elastic:{password}"

CloudTrail logs are in JSON format, with the individual events nested under a top-level Records array, so expand_event_list_from_field: Records is needed to decode the JSON and split each file into separate events.
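
Schematically, each CloudTrail file in S3 looks like this (the field values here are illustrative; real records carry many more fields):

{
  "Records": [
    {"eventVersion": "1.05", "eventSource": "signin.amazonaws.com", "eventName": "ConsoleLogin", ...},
    {"eventVersion": "1.05", "eventSource": "s3.amazonaws.com", "eventName": "GetObject", ...}
  ]
}

With expand_event_list_from_field: Records, each element of that array becomes its own document instead of the whole file being indexed as one event.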

Or you can use the cloudtrail fileset directly in Filebeat. You can run ./filebeat modules enable aws and then in modules.d/aws.yml you should see a section for CloudTrail logs.
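
If you go the module route, the relevant part of modules.d/aws.yml would look roughly like this (a sketch; double-check the variable names against the aws module docs for your Filebeat version, and substitute your own queue URL):

- module: aws
  cloudtrail:
    enabled: true
    # SQS queue receiving notifications for the CloudTrail S3 bucket
    var.queue_url: https://sqs.us-west-2.amazonaws.com/XXXXXXXXXX/sqs-name
    var.credential_profile_name: default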

Hi @Kaiyan_Sheng,

How can I read files that have the content type application/octet-stream?

I'm streaming the CloudWatch and WAF logs from multiple accounts to a common S3 bucket using Firehose, and the resulting objects have the content type application/octet-stream.

And which content types does Filebeat accept?

Thanks!

Hi @Kaiyan_Sheng, @ChrsMark,

Can you check the above comment?

Thanks,
Nithya

@Nithya Sorry for the late response! Right now the S3 input in Filebeat reads files with bufio.NewReader unless the content type is application/x-gzip, in which case it uses gzip.NewReader instead. There is no special reader for application/octet-stream yet.
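
To illustrate that branching, here is a minimal Go sketch of the behaviour described above (not the actual s3/input.go source):

package main

import (
	"bufio"
	"compress/gzip"
	"fmt"
	"io"
	"strings"
)

// newReader picks a reader the way described above: gzip.NewReader only for
// application/x-gzip; everything else, including application/octet-stream,
// falls through to bufio.NewReader and is read as plain text.
func newReader(contentType string, body io.Reader) (*bufio.Reader, error) {
	if contentType == "application/x-gzip" {
		gz, err := gzip.NewReader(body) // returns "gzip: invalid header" if body isn't actually gzipped
		if err != nil {
			return nil, fmt.Errorf("gzip.NewReader failed: %w", err)
		}
		return bufio.NewReader(gz), nil
	}
	return bufio.NewReader(body), nil
}

func main() {
	// The octet-stream branch never errors, so the error can be ignored here.
	r, _ := newReader("application/octet-stream", strings.NewReader("a log line\n"))
	line, _ := r.ReadString('\n')
	fmt.Print(line)
}

So an octet-stream object whose contents are actually gzipped would currently be read as raw bytes rather than decompressed.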

What error message do you see when you try the config below?

filebeat.inputs:
 - type: s3
   queue_url: https://sqs.us-west-2.amazonaws.com/XXXXXXXXXX/sqs-name
   visibility_timeout: 300s
