[Filebeat] Bug with multiple aws-cloudwatch logs using log_group_name_prefix

I have a lot of running AWS Lambdas and I'm trying to fetch their logs using Filebeat's aws-cloudwatch input type. I'm using start_position: beginning in order to backfill the logs when Filebeat starts.

When I specify a single log group everything works as expected, but when I use log_group_name_prefix: /aws/lambda, which matches many log groups, Filebeat does not try to fetch all of them from the beginning. Instead, it fetches only the first log group with startTime = 1970-01-01 00:00:00 +0000 UTC, endTime = 2022-11-23 16:28:17 +0000 UTC, and then fetches only the scan_frequency interval for the rest of the log groups.
What I expect is for the start_position configuration to be respected for each individual log group, but instead it is only applied to the first API call.
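To illustrate, here is a minimal Go sketch of what the debug logs below suggest is happening, assuming a single startTime is tracked for the whole input and advanced after every poll; the names and structure are illustrative, not the actual Filebeat source:

    package main

    import (
        "fmt"
        "time"
    )

    func main() {
        logGroups := []string{
            "/aws/lambda/abn-verification-staging-abn-verification",
            "/aws/lambda/aws-slack-notifications",
        }

        // start_position: beginning -> the very first poll starts at the Unix epoch.
        startTime := time.Unix(0, 0).UTC()

        for _, group := range logGroups {
            endTime := time.Now().UTC()
            // Only the first iteration ever sees the epoch; later log groups
            // inherit the already-advanced startTime.
            fmt.Printf("group=%s startTime=%s endTime=%s\n", group, startTime, endTime)
            startTime = endTime // shared state: the next group loses the backfill window
        }
    }

With shared state like this, only the first FilterLogEvents call ever uses the epoch, which matches the debug output below.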

My configuration:

    filebeat.inputs:
       - type: aws-cloudwatch
         log_group_name_prefix: /aws/lambda
         region_name: eu-west-1
         scan_frequency: 15m
         start_position: beginning

Debug logs from the running Filebeat:

The first set of lines is for the first Lambda, /aws/lambda/abn-verification-staging-abn-verification, where you can see the startTime is 1970-01-01 00:00:00:

{"log.level":"debug","@timestamp":"2022-11-23T16:28:16.918Z","log.logger":"registrar","log.origin":{"file.name":"registrar/registrar.go","file.line":140},"message":"Starting Registrar","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2022-11-23T16:28:16.918Z","log.logger":"input.aws-cloudwatch","log.origin":{"file.name":"compat/compat.go","file.line":113},"message":"Input 'aws-cloudwatch' starting","service.name":"filebeat","id":"648125F3E9E24916","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2022-11-23T16:28:17.102Z","log.logger":"input.aws-cloudwatch.cloudwatch_poller","log.origin":{"file.name":"awscloudwatch/input.go","file.line":167},"message":"start_position = beginning, startTime = 1970-01-01 00:00:00 +0000 UTC, endTime = 2022-11-23 16:28:17 +0000 UTC","service.name":"filebeat","id":"648125F3E9E24916","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2022-11-23T16:28:17.102Z","log.logger":"input.aws-cloudwatch.cloudwatch_poller","log.origin":{"file.name":"awscloudwatch/input.go","file.line":159},"message":"sleeping for 1m0s before checking new logs","service.name":"filebeat","id":"648125F3E9E24916","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2022-11-23T16:28:17.102Z","log.logger":"input.aws-cloudwatch.cloudwatch_poller","log.origin":{"file.name":"awscloudwatch/input.go","file.line":204},"message":"aws-cloudwatch input worker for log group: '/aws/lambda/abn-verification-staging-abn-verification' has started","service.name":"filebeat","id":"648125F3E9E24916","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2022-11-23T16:28:17.319Z","log.logger":"input.aws-cloudwatch.cloudwatch_poller","log.origin":{"file.name":"awscloudwatch/cloudwatch.go","file.line":87},"message":"sleeping for 200ms before making FilterLogEvents API call again","service.name":"filebeat","id":"648125F3E9E24916","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2022-11-23T16:28:17.520Z","log.logger":"input.aws-cloudwatch.cloudwatch_poller","log.origin":{"file.name":"awscloudwatch/cloudwatch.go","file.line":89},"message":"done sleeping","service.name":"filebeat","id":"648125F3E9E24916","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2022-11-23T16:28:17.520Z","log.logger":"input.aws-cloudwatch.cloudwatch_poller","log.origin":{"file.name":"awscloudwatch/cloudwatch.go","file.line":91},"message":"Processing #0 events","service.name":"filebeat","id":"648125F3E9E24916","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2022-11-23T16:28:17.520Z","log.logger":"input.aws-cloudwatch.cloudwatch_poller","log.origin":{"file.name":"awscloudwatch/input.go","file.line":200},"message":"aws-cloudwatch input worker for log group '/aws/lambda/abn-verification-staging-abn-verification' has stopped.","service.name":"filebeat","id":"648125F3E9E24916","ecs.version":"1.6.0"}

The second set is for the next Lambda, /aws/lambda/aws-slack-notifications, with a startTime of 2022-11-23 16:28:17, while I expect to see 1970-01-01 00:00:00:

{"log.level":"debug","@timestamp":"2022-11-23T16:29:17.102Z","log.logger":"input.aws-cloudwatch.cloudwatch_poller","log.origin":{"file.name":"awscloudwatch/input.go","file.line":161},"message":"done sleeping","service.name":"filebeat","id":"648125F3E9E24916","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2022-11-23T16:29:17.103Z","log.logger":"input.aws-cloudwatch.cloudwatch_poller","log.origin":{"file.name":"awscloudwatch/input.go","file.line":167},"message":"start_position = beginning, startTime = 2022-11-23 16:28:17 +0000 UTC, endTime = 2022-11-23 16:29:17 +0000 UTC","service.name":"filebeat","id":"648125F3E9E24916","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2022-11-23T16:29:17.103Z","log.logger":"input.aws-cloudwatch.cloudwatch_poller","log.origin":{"file.name":"awscloudwatch/input.go","file.line":159},"message":"sleeping for 1m0s before checking new logs","service.name":"filebeat","id":"648125F3E9E24916","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2022-11-23T16:29:17.103Z","log.logger":"input.aws-cloudwatch.cloudwatch_poller","log.origin":{"file.name":"awscloudwatch/input.go","file.line":204},"message":"aws-cloudwatch input worker for log group: '/aws/lambda/aws-slack-notifications' has started","service.name":"filebeat","id":"648125F3E9E24916","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2022-11-23T16:29:17.468Z","log.logger":"input.aws-cloudwatch.cloudwatch_poller","log.origin":{"file.name":"awscloudwatch/cloudwatch.go","file.line":87},"message":"sleeping for 200ms before making FilterLogEvents API call again","service.name":"filebeat","id":"648125F3E9E24916","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2022-11-23T16:29:17.668Z","log.logger":"input.aws-cloudwatch.cloudwatch_poller","log.origin":{"file.name":"awscloudwatch/cloudwatch.go","file.line":89},"message":"done sleeping","service.name":"filebeat","id":"648125F3E9E24916","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2022-11-23T16:29:17.668Z","log.logger":"input.aws-cloudwatch.cloudwatch_poller","log.origin":{"file.name":"awscloudwatch/cloudwatch.go","file.line":91},"message":"Processing #0 events","service.name":"filebeat","id":"648125F3E9E24916","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2022-11-23T16:29:17.668Z","log.logger":"input.aws-cloudwatch.cloudwatch_poller","log.origin":{"file.name":"awscloudwatch/input.go","file.line":200},"message":"aws-cloudwatch input worker for log group '/aws/lambda/aws-slack-notifications' has stopped.","service.name":"filebeat","id":"648125F3E9E24916","ecs.version":"1.6.0"}

Hi @Kaiyan_Sheng, could you please have a look here?

Hello @EDzhelyov, thanks for reaching out! According to the AWS API documentation, startTime can only go back as far as Jan 1, 1970 00:00:00 UTC: FilterLogEvents - Amazon CloudWatch Logs.

Hi Kaiyan,

I'm unsure how the documentation you linked is related to my issue.

What I would expect to happen when you have start_position: beginning and multiple log groups matched by the log_group_name_prefix is for Filebeat to fetch all records from the beginning for each matched log group. Instead (at least in 8.4), it only does so for the first log group.
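As a rough Go sketch of what I would expect instead, assuming each matched log group kept its own startTime (again, illustrative names only, not Filebeat code):

    package main

    import (
        "fmt"
        "time"
    )

    func main() {
        // One independent startTime per matched log group.
        startTimes := map[string]time.Time{
            "/aws/lambda/abn-verification-staging-abn-verification": time.Unix(0, 0).UTC(),
            "/aws/lambda/aws-slack-notifications":                    time.Unix(0, 0).UTC(),
        }

        for group, start := range startTimes {
            end := time.Now().UTC()
            // Every log group backfills from the epoch on its first poll...
            fmt.Printf("group=%s startTime=%s endTime=%s\n", group, start, end)
            startTimes[group] = end // ...and only its own window advances afterwards.
        }
    }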

There is another issue we just found: the scan_frequency sleep is applied after each individual log group query rather than once per polling cycle. What I would expect when we have a log_group_name_prefix and a scan_frequency of 15m is for Filebeat to query each matched log group at 15-minute intervals. Instead, what we see in the logs is that Filebeat makes only one log group query every 15 minutes.
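A small worked example of the difference, assuming a hypothetical 40 matched log groups (the count is made up for illustration):

    package main

    import (
        "fmt"
        "time"
    )

    func main() {
        scanFrequency := 15 * time.Minute
        numLogGroups := 40 // hypothetical count of log groups matched by /aws/lambda

        // Observed: one log group query, then a scan_frequency sleep, then the next
        // group, so a given group is only revisited every numLogGroups * scan_frequency.
        observed := time.Duration(numLogGroups) * scanFrequency
        fmt.Printf("observed: each log group polled about every %s\n", observed) // 10h0m0s

        // Expected: poll every matched group, then sleep scan_frequency once per cycle.
        fmt.Printf("expected: each log group polled every %s\n", scanFrequency) // 15m0s
    }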

You can see that by looking at two consecutive logs from the Filebeat process:

{"log.level":"info","@timestamp":"2022-11-29T09:34:28.191Z","log.logger":"input.aws-cloudwatch.cloudwatch_poller","log.origin":{"file.name":"awscloudwatch/input.go","file.line":200},"message":"aws-cloudwatch input worker for log group '/aws/lambda/rds-slack-notifications' has stopped.","service.name":"filebeat","id":"652FC3D5076F3391","ecs.version":"1.6.0"}

{"log.level":"info","@timestamp":"2022-11-29T09:49:27.308Z","log.logger":"input.aws-cloudwatch.cloudwatch_poller","log.origin":{"file.name":"awscloudwatch/input.go","file.line":204},"message":"aws-cloudwatch input worker for log group: '/aws/lambda/record-unsubscribed-clients-production-record-unsubscribed' has started","service.name":"filebeat","id":"652FC3D5076F3391","ecs.version":"1.6.0"}

The worker for /aws/lambda/rds-slack-notifications stops at 09:34, and the worker for the next log group only starts at 09:49, a full scan_frequency (15 minutes) later.