I have vpc flow logs going to an S3 bucket and an SQS notification for any object creation event.
I actually much preferred the Logstash method of polling the bucket, for several reasons - mainly the ability to re-index from source easily, use regex matching of file patterns, and general simplicity. The real-time notifications don't provide me with much benefit and just add complexity. I hope to see polling as an option in Filebeat some day.
Nonetheless, I've set it up, and I am using an instance role on my filebeat node for access to SQS and S3.
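For reference, this is roughly the shape of the policy attached to that instance role - the ARNs below are placeholders, not my actual queue or bucket names:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "sqs:ReceiveMessage",
        "sqs:DeleteMessage",
        "sqs:ChangeMessageVisibility"
      ],
      "Resource": "arn:aws:sqs:us-west-2:ACCOUNT_ID:QUEUE_NAME"
    },
    {
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::BUCKET_NAME/*"
    }
  ]
}
```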
Here is my aws.yml:
Although quite a bit of data is being read successfully and I can see it in Kibana, I'm getting a lot of errors and warnings. I have no easy way to tell if it's getting everything, but these messages imply that some may be failing and getting dropped. I can't be certain though, because they're very confusing messages.
2020-03-27T01:27:10.019Z ERROR [s3] s3/input.go:204 failed to receive message from SQS: MissingRegion: could not find region configuration
2020-03-27T01:27:20.019Z ERROR [s3] s3/input.go:204 failed to receive message from SQS: MissingRegion: could not find region configuration
2020-03-27T01:27:20.019Z ERROR [s3] s3/input.go:204 failed to receive message from SQS: MissingRegion: could not find region configuration
2020-03-27T01:27:30.020Z ERROR [s3] s3/input.go:204 failed to receive message from SQS: MissingRegion: could not find region configuration
2020-03-27T01:27:30.020Z ERROR [s3] s3/input.go:204 failed to receive message from SQS: MissingRegion: could not find region configuration
2020-03-27T01:27:40.020Z ERROR [s3] s3/input.go:204 failed to receive message from SQS: MissingRegion: could not find region configuration
2020-03-27T01:27:40.020Z ERROR [s3] s3/input.go:204 failed to receive message from SQS: MissingRegion: could not find region configuration
2020-03-27T01:27:40.887Z ERROR [s3] s3/input.go:475 ReadString failed: context deadline exceeded
2020-03-27T01:27:41.531Z ERROR [s3] s3/input.go:475 ReadString failed: context deadline exceeded
2020-03-27T01:27:41.531Z ERROR [s3] s3/input.go:386 createEventsFromS3Info failed for AWSLogs//vpcflowlogs/us-west-2/2020/03/26/_vpcflowlogs_us-west-2_fl-.log.gz: ReadString failed: context deadline exceeded
2020-03-27T01:27:41.571Z WARN [s3] s3/input.go:277 Processing message failed, updating visibility timeout
2020-03-27T01:27:41.636Z INFO [s3] s3/input.go:282 Message visibility timeout updated to 300
2020-03-27T01:27:41.794Z WARN [s3] s3/input.go:277 Processing message failed, updating visibility timeout
2020-03-27T01:27:41.799Z INFO [s3] s3/input.go:282 Message visibility timeout updated to 300
2020-03-27T01:27:41.993Z WARN [s3] s3/input.go:277 Processing message failed, updating visibility timeout
2020-03-27T01:27:41.997Z INFO [s3] s3/input.go:282 Message visibility timeout updated to 300
Any ideas?
Hi @swisscheese, thank you for letting us know about your preference for the polling method! I will create a GitHub issue to track this.
For the error messages you see in the log, I believe they're caused by the other filesets that are enabled (by default) in the aws module. We do have a GitHub issue to fix this: https://github.com/elastic/beats/issues/17256
If you change aws.yml to the config below (assuming you are running version 7.6), you should see a better/cleaner log.
- module: aws
  cloudtrail:
    enabled: false
  elb:
    enabled: false
  s3access:
    enabled: false
  vpcflow:
    enabled: true
    var.queue_url: https://sqs.us-west-2.amazonaws.com//vpcflow
Ahh sorry, I missed this error message. Question: it seems like this error showed up twice for the same message AWSLogs/699536110035/vpcflowlogs/us-west-2/2020/03/29/_vpcflowlogs_us-west-2_fl-.log.gz within 1 second, which is not cool... Did you by any chance change the visibility_timeout param?
2020-03-30T18:30:44.613Z ERROR [s3] s3/input.go:386 createEventsFromS3Info failed for AWSLogs/699536110035/vpcflowlogs/us-west-2/2020/03/29/_vpcflowlogs_us-west-2_fl-.log.gz: ReadString failed: context deadline exceeded
2020-03-30T18:30:45.207Z ERROR [s3] s3/input.go:386 createEventsFromS3Info failed for AWSLogs/699536110035/vpcflowlogs/us-west-2/2020/03/29/_vpcflowlogs_us-west-2_fl-.log.gz: ReadString failed: context deadline exceeded
Do you know if this log file eventually gets read by Filebeat? Or does it keep showing up in the error messages repeatedly?
If you could attach the actual file here, that would be great 
Since it's a context deadline exceeded error, could you try increasing the var.api_timeout setting in aws.yml, please?
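For example, in the vpcflow section of aws.yml - 720s here is just an arbitrary larger value to rule the timeout out, so adjust as needed:

```yaml
  vpcflow:
    enabled: true
    # give slow S3 object reads more time before the context deadline fires
    var.api_timeout: 720s
```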
Hi,
I am facing exactly the same problem as above.
My configuration:
- module: aws
  cloudtrail:
    enabled: false
  elb:
    enabled: false
  s3access:
    enabled: false
  vpcflow:
    enabled: true
    # AWS SQS queue url
    var.queue_url: https://sqs.eu-west-3.amazonaws.com/$AWSACCOUNID/$QUEUEName
I did not change any other values.
My error log:
Apr 09 19:35:21 ip-10-133-50-250.******.fr filebeat[30039]: 2020-04-09T19:35:21.769+0200 ERROR [s3] s3/input.go:475 ReadString failed: context deadline exceeded
Apr 09 19:35:21 ip-10-133-50-250.******.fr filebeat[30039]: 2020-04-09T19:35:21.769+0200 ERROR [s3] s3/input.go:386 createEventsFromS3Info failed for AWSLogs/******/vpcflowlogs/eu-west-3/2020/04/09*******_vpcflowlogs_eu-west-3_fl-0629b5eccc8f3d0ad_20200409T0120Z_9b923cb5.log.gz: ReadString failed: context deadline exceeded
Apr 09 19:35:43 ip-10-133-50-250.******.fr filebeat[30039]: 2020-04-09T19:35:43.317+0200 WARN [s3] s3/input.go:277 Processing message failed, updating visibility timeout
Apr 09 19:35:43 ip-10-133-50-250.******.fr filebeat[30039]: 2020-04-09T19:35:43.357+0200 INFO [s3] s3/input.go:282 Message visibility timeout updated to 300
My SQS queue settings:
VPCFlowLogsQueue:
  Type: AWS::SQS::Queue
  Properties:
    DelaySeconds: 0
    MaximumMessageSize: 262144
    MessageRetentionPeriod: 345600
    VisibilityTimeout: 30
The SQS queue policy grants sqs:* to the Logstash instance role.
Like @swisscheese, I have VPC flow log data showing up, but I cannot be certain it is all of it.
Elasticsearch version: 7.5.1
Filebeat version: 7.6.2 (using the default template/dashboard and index)
S3 and SQS access through an instance profile role on AWS EC2 (s3:* and sqs:* on the specified resources).
Any thoughts?
Regards,
Thomas.
Could you give the config below a try, with var.api_timeout increased, please?
- module: aws
  cloudtrail:
    enabled: false
  elb:
    enabled: false
  s3access:
    enabled: false
  vpcflow:
    enabled: true
    var.queue_url: https://sqs.eu-west-3.amazonaws.com/$AWSACCOUNID/$QUEUEName
    var.api_timeout: 720s
    var.visibility_timeout: 300s
Also, since the error message is complaining about the file AWSLogs/******/vpcflowlogs/eu-west-3/2020/04/09*******_vpcflowlogs_eu-west-3_fl-0629b5eccc8f3d0ad_20200409T0120Z_9b923cb5.log.gz, could you check in Kibana Discover (maybe with a filter) to see whether there are events from this file, please?
Thanks,
Kaiyan