I'm running Filebeat 7.9.0, sending logs to Logstash 7.9.0 before they are handed over to ES 7.9.0. For Filebeat, I have the AWS module configured to retrieve ELB logs.
It took me a while to get it set up properly: for the SQS queue, not only ReceiveMessage but also ChangeMessageVisibility and DeleteMessage seem to be needed. At least with these, there are no more permission errors in the logs.
Lots of AWS ELB logs now show up in Kibana. However, I still fear that quite a number of logs are missing from Kibana.
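For reference, a minimal sketch of the kind of IAM policy I mean, built as a Python dict so it can be dumped to JSON. The ARNs are placeholders rather than my real ones, the function name is made up for illustration, and s3:GetObject is included on the assumption that reading the log objects themselves needs it as well.

```python
import json


def build_filebeat_policy(queue_arn, bucket_arn):
    """Build an IAM policy document (as a dict) for the SQS and S3 access described above.

    queue_arn and bucket_arn are placeholders supplied by the caller.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # The three SQS actions that made the permission errors go away.
                "Effect": "Allow",
                "Action": [
                    "sqs:ReceiveMessage",
                    "sqs:ChangeMessageVisibility",
                    "sqs:DeleteMessage",
                ],
                "Resource": queue_arn,
            },
            {
                # Assumption: read access to the log objects in the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": bucket_arn + "/*",
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical ARNs, for illustration only.
    policy = build_filebeat_policy(
        "arn:aws:sqs:us-east-1:123456789012:my-elb-log-queue",
        "arn:aws:s3:::my-bucket",
    )
    print(json.dumps(policy, indent=2))
```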
In the filebeat logs, I see messages like this:
Sep 24 09:14:43 hostname filebeat: 2020-09-24T09:14:43.409Z#011ERROR#011[s3]#011s3/input.go:487#011readStringAndTrimDelimiter failed: context deadline exceeded
Sep 24 09:14:43 hostname filebeat: 2020-09-24T09:14:43.409Z#011ERROR#011[s3]#011s3/input.go:396#011createEventsFromS3Info failed processing file from s3 bucket "<my-bucket>" with name "path/to/filename.log": readStringAndTrimDelimiter failed: context deadline exceeded
and later by messages like these:
Sep 24 09:14:52 hostname filebeat: 2020-09-24T09:14:52.898Z#011WARN#011[s3]#011s3/input.go:299#011Half of the set visibilityTimeout passed, visibility timeout needs to be updated
Sep 24 09:14:52 hostname filebeat: 2020-09-24T09:14:52.901Z#011INFO#011[s3]#011s3/input.go:306#011Message visibility timeout updated to 500 seconds
I found this thread: https://discuss.elastic.co/t/aws-vpcflow-errors-count-not-find-region-configuration-context-deadline-exceeded/225471
Based on it, I already bumped the api_timeout to 500. (It would have been nice if the documentation had mentioned that its maximum is half of visibility_timeout. That would have saved me some head scratching over why filebeat dies on startup, until I figured out that I have to bump visibility_timeout as well...)
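To spell out the constraint as I understand it, here is a small sketch in Python. The function name is hypothetical, this is not Filebeat's own validation code, and whether the upper bound is inclusive is my assumption.

```python
def check_timeouts(api_timeout_s, visibility_timeout_s):
    """Check that api_timeout is positive and at most half of visibility_timeout.

    Returns the largest api_timeout (in seconds) allowed under that rule,
    or raises ValueError if the given combination breaks it.
    """
    limit = visibility_timeout_s / 2
    if api_timeout_s <= 0:
        raise ValueError("api_timeout must be larger than 0s")
    if api_timeout_s > limit:
        raise ValueError(
            "api_timeout %ss exceeds half of visibility_timeout (%ss)"
            % (api_timeout_s, limit)
        )
    return limit


if __name__ == "__main__":
    # An api_timeout of 500s needs a visibility_timeout of at least 1000s.
    print(check_timeouts(500, 1000))
```

Under this rule, raising api_timeout any further means raising visibility_timeout to at least twice that value first.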
So when I download one of these failed files and run wc -l, I get a different number than when I search in Kibana for:
event.module: aws AND event.dataset : aws.elb AND aws.s3.object.key : "path/to/filename.log"
For one particular example, wc -l reports 3114 lines, while the Kibana search above returns only 2183 hits. I expected the two numbers to be the same. Shouldn't they be?
Would I need to bump the api_timeout even higher? The node running Filebeat is in the same AWS region as the SQS queue.
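To put a number on the gap, a minimal Python sketch of the comparison. The helper names are made up for illustration, and the Kibana hit count is typed in by hand rather than queried from Elasticsearch.

```python
def count_lines(path):
    """Count newline characters in a file, which is what `wc -l` reports."""
    total = 0
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large log files are not loaded at once.
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            total += chunk.count(b"\n")
    return total


def missing_events(file_lines, kibana_hits):
    """Return (absolute gap, gap as a percentage of the file's lines)."""
    gap = file_lines - kibana_hits
    return gap, round(100.0 * gap / file_lines, 1)


if __name__ == "__main__":
    # Numbers from the example in this post.
    print(missing_events(3114, 2183))  # (931, 29.9)
```

For this file, that is 931 lines, or roughly 30%, that never made it into Kibana.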