Filebeat with AWS ELB logs = newbie problems

I'm relatively new to ELK; I know the basics of administration and use, but I'm by no means a specialist. I'll briefly describe the problem I'm struggling with.

On one of our services (hosted on AWS) we have suspiciously high traffic from several addresses. To make analyzing the historical ELB (Elastic Load Balancing) logs easier, I decided to load them into our ELK cluster. As long as I upload a single file manually, everything is fine: the structure is recognized correctly and I can analyze it in Kibana. The trouble is that just a few days of traffic comes to almost 4,000 log files (over 1.5 million log lines), so I am trying to ingest these files automatically.

Here is the first problem: when I try to configure Filebeat (8.1.1) to read data directly from the S3 bucket, I get the error:

Input "aws-s3" failed: query s3 failed to initialize: failed to get AWS region for bucket: request canceled, context canceled

Config for this part:

- type: aws-s3
  enabled: true
  default_region: eu-central-1
  bucket_arn: arn:aws:s3:::elb-access-logs
  number_of_workers: 5
  bucket_list_interval: 300s
  aws_access_key_id: super_secret_key
  aws_secret_access_key: super_secret_secret
  credential_profile_name: my_work_profile
  expand_event_list_from_field: Records

The keys are correct; the AWS CLI works fine with these credentials:

❯ aws s3api get-bucket-location --bucket elb-access-logs
{
    "LocationConstraint": "eu-central-1"
}

I searched for a solution, including on this forum, and none of the suggestions worked for me.

So I decided to download the logs to a local disk and ingest them locally, but it looks like I'm doing something wrong, because Filebeat literally does nothing :/
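For reference, I pulled the files down with a plain `aws s3 sync` (a sketch; the local target directory is just the placeholder path from my config, and I'm assuming the whole bucket can be mirrored as-is):

```shell
# Mirror the ELB access-log bucket to a local directory,
# preserving the dated sub-folders (region/YYYY/MM/DD/...).
aws s3 sync \
  s3://elb-access-logs/ \
  /path_to_my_downloaded_logs/ \
  --profile my_work_profile
```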

- type: filestream
  enabled: true
  paths:
    - /path_to_my_downloaded_logs/elasticloadbalancing/eu-central-1/*.log

In the second case, the only thing that comes to mind is that the logs are scattered across dated subdirectories, e.g. ./eu-central-1/2022/03/14/, but I assumed (maybe wrongly) that the input would scan subdirectories on its own. If not, how do I enable that?
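Would something like this work? A sketch of what I'd try next, assuming filestream's `**` glob expansion (recursive_glob) is enabled by default and that the `id` can be any made-up label:

```
- type: filestream
  id: elb-historical-logs   # hypothetical id, just a unique label
  enabled: true
  paths:
    # assuming '**' is expanded recursively by the scanner,
    # this should cover the dated YYYY/MM/DD subdirectories
    - /path_to_my_downloaded_logs/elasticloadbalancing/eu-central-1/**/*.log
```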

P.S. Yes, I know I can wire S3 up to SQS and pull the data from there with the ready-made integration, and I will probably do that in the future, but for now my problem is the sheer volume of historical logs.
