Elastic Agent, aws-s3-default-aws-s3-vpcflow keeps failing

I have an AWS environment that ships VPC flow logs to an S3 bucket. I am using the AWS VPC flow logs integration in Elastic Agent to process these logs and to monitor for new ones.
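For context, the integration is built on Filebeat's aws-s3 input; as far as I can tell, my setup boils down to roughly this standalone-Filebeat equivalent in bucket-polling mode (the bucket ARN below is a placeholder, not my real bucket):

```yaml
filebeat.inputs:
  - type: aws-s3
    # Bucket-polling mode: no SQS queue configured, so Filebeat
    # lists the bucket itself to discover objects.
    bucket_arn: "arn:aws:s3:::example-vpcflow-bucket"  # placeholder
```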

It connects to the bucket successfully.

When the agent starts up, CPU utilisation goes high and stays high for considerable periods of time, and memory utilisation within Filebeat also climbs very high, so it is definitely doing something.

I have left it for a couple of days, but nothing appears in my Elasticsearch cluster.

When I look at the logs for the Elastic Agent in question, I see repeated messages such as:

```
00:17:22.551 [elastic_agent][info] Component state changed aws-s3-default (HEALTHY->STOPPED): Suppressing FAILED state due to restart for '5056' exited with code '2'
00:17:22.558 [elastic_agent][info] Unit state changed aws-s3-default (HEALTHY->STOPPED): Suppressing FAILED state due to restart for '5056' exited with code '2'
00:17:22.558 [elastic_agent][info] Unit state changed aws-s3-default-aws-s3-vpcflow-97ad6888-efd5-400e-8ca3-0a799ec519d6 (HEALTHY->STOPPED): Suppressing FAILED state due to restart for '5056' exited with code '2'
00:17:23.755 [elastic_agent][info] Spawned new component aws-s3-default: Starting: spawned pid '4012'
00:17:23.756 [elastic_agent][info] Spawned new unit aws-s3-default-aws-s3-vpcflow-97ad6888-efd5-400e-8ca3-0a799ec519d6: Starting: spawned pid '4012'
00:17:23.756 [elastic_agent][info] Spawned new unit aws-s3-default: Starting: spawned pid '4012'
00:17:28.689 [elastic_agent][info] Component state changed aws-s3-default (STARTING->HEALTHY): Healthy: communicating with pid '4012'
```

When I check on the agent regularly, I can see a different PID for Filebeat each time, which confirms it is repeatedly starting and restarting.

I have a feeling the issue is that the S3 bucket in question already contains a considerable number of VPC log objects, and this is causing Filebeat to max out its memory and then fail.

I have tried increasing the memory, but it ends the same way.

I don't want to have to create a new S3 bucket.

Would love any suggestions on how to ingest the existing VPC logs and keep monitoring for new ones.

Are you using it combined with SQS or directly pointing to the bucket?

Directly pointing to the bucket - no SQS

Yeah, without SQS the input needs to get a list of all the objects in the bucket, and depending on the number of objects this can be really expensive.
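To put rough numbers on it: an S3 list call returns at most 1,000 keys per page, so a bucket with, say, 5 million objects takes around 5,000 API calls for a single listing pass, and in polling mode the input repeats that pass on every poll cycle while also keeping state for every object it has already processed. These are the two settings that drive the cost; a minimal sketch with illustrative values, not recommendations:

```yaml
filebeat.inputs:
  - type: aws-s3
    bucket_arn: "arn:aws:s3:::example-vpcflow-bucket"  # placeholder
    # How often the whole bucket (or prefix) is re-listed.
    bucket_list_interval: 900s
    # How many objects are downloaded and parsed concurrently.
    number_of_workers: 5
```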

For the old logs I would suggest setting a prefix by year and month until you have processed everything, but for new logs the best option is to use SQS.
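Something like this, as a sketch. It assumes the standard VPC flow log key layout, and the bucket name, account ID, region, and queue name are all placeholders. For the backfill, restrict the listing to one year/month at a time with `bucket_list_prefix`:

```yaml
filebeat.inputs:
  - type: aws-s3
    bucket_arn: "arn:aws:s3:::example-vpcflow-bucket"
    # Process January 2023 only; move the prefix forward once it is done.
    bucket_list_prefix: "AWSLogs/123456789012/vpcflowlogs/us-east-1/2023/01/"
```

Then, once you have caught up, switch the input over to SQS so new objects are pushed to Filebeat instead of being discovered by listing:

```yaml
filebeat.inputs:
  - type: aws-s3
    # Requires the bucket to publish s3:ObjectCreated:* event
    # notifications to this queue.
    queue_url: "https://sqs.us-east-1.amazonaws.com/123456789012/example-vpcflow-queue"
```

With SQS in place the input never has to list the bucket at all, so the size of the existing backlog stops mattering.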

Thanks for that - looks like that's the way I will go.

It is a shame the prefix does not support wildcards.
