Enhancing Ingestion Rate of Elastic Agent Reading from Azure Event Hub

I have successfully deployed an Elastic Agent that interfaces with an Azure Event Hub. The current configuration processes approximately 5 to 10 million logs per hour, with individual log sizes ranging from 5 to 20 KB.

The system utilizes a virtual machine (VM) to run the Elastic Agent and subsequently transmit the logs to Elastic Cloud. However, I've encountered an issue where the agent does not ingest data as quickly as it's generated by the source, resulting in a consistent lag.

Upon reviewing the VM's resources, I observed that CPU utilization is between 20% and 40% and there is ample memory available. Despite these seemingly sufficient resources, the lag persists.

Could you provide guidance on how to enhance the ingestion rate of the Elastic Agent to better match the pace of data generation from the Azure Event Hub?

The first thing to try would be to make sure you're running at least Elastic Agent 8.12 and to switch that particular agent to the throughput performance preset; see: Using Elastic Agent Performance Presets in 8.12 | Elastic Blog
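For context, the preset is an option on the agent's Elasticsearch output. On a Fleet-managed agent it can be selected in the output's performance tuning settings in Fleet; on a standalone agent it goes in elastic-agent.yml. A minimal sketch, with a placeholder Elastic Cloud endpoint and placeholder credentials:

outputs:
  default:
    type: elasticsearch
    hosts: ["https://my-deployment.es.us-central1.gcp.cloud.es.io:443"]  # placeholder endpoint
    api_key: "id:key"                                                    # placeholder credentials
    preset: throughput  # adjusts workers, batch sizes, and queue settings in favor of throughput over latency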

Beyond that, would you be able to share your Azure Event Hub integration configuration, and outline for us any additional processors or ingest pipelines you may have configured?
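If you do have ingest pipelines configured, one quick way to see how much time they're consuming (assuming you can query the cluster directly, for example from Kibana Dev Tools) is the ingest section of the node stats API, which reports per-pipeline document counts and time spent:

GET _nodes/stats/ingest?filter_path=nodes.*.ingest.pipelines

Heavy grok processing in a pipeline can make the Elasticsearch side the bottleneck even when the agent VM itself looks mostly idle.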

Can you also share a snippet from your agent log so I can see the internal metrics being generated for this input on your agent?

As a general recommendation for pub/sub integrations like Event Hub, we recommend employing multiple smaller nodes to scale throughput; see our recommendations for AWS S3/SQS, which apply to Event Hub as well: Get the most from Elastic Agent with Amazon S3 and SQS | Elastic Blog
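One way to do that with Event Hub (a sketch, not an official reference; the option names below follow the standalone azure-eventhub input, and the Fleet integration exposes the same fields under similar labels): the input checkpoints partition ownership in an Azure Storage container, so several agents configured with the same event hub, consumer group, and storage account will split the partitions among themselves.

- type: azure-eventhub
  eventhub: "logs-hub"                    # placeholder event hub name
  consumer_group: "elastic-agent"         # same consumer group on every agent
  connection_string: "Endpoint=sb://..."  # placeholder namespace connection string
  storage_account: "agentcheckpoints"     # shared checkpoint/ownership store
  storage_account_key: "..."              # placeholder key

Keep in mind that parallelism is capped by the event hub's partition count, so running more agents than there are partitions won't add throughput.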

Besides what @strawgate mentioned, you may need to check on the Azure side whether any throttling is occurring.

Azure may throttle your requests depending on several factors, such as event rate, event size, and the number of throughput units (TUs) you have provisioned.
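As a rough back-of-envelope using the volumes from the original post (assuming a standard-tier namespace, where each TU allows roughly 1 MB/s or 1,000 events/s of ingress and 2 MB/s of egress, and assuming an average event size of about 10 KB):

10,000,000 events/hour x ~10 KB ≈ 100 GB/hour ≈ 28 MB/s of sustained egress
28 MB/s ÷ 2 MB/s egress per TU ≈ 14 TUs

If the namespace has noticeably fewer TUs than that and auto-inflate is off, the consumer will be throttled regardless of how the agent itself is tuned.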

Hi @strawgate - We are using Elastic Agent version 8.17, and below are the performance tuning settings we have applied:

bulk_max_size: 4096
worker: 16
queue.mem.events: 131072
queue.mem.flush.min_events: 4096
queue.mem.flush.timeout: 5s
compression_level: 1
idle_connection_timeout: 15s

We also have an ingest pipeline used to parse the logs (it contains some grok patterns and JSON parsing as well).

Below are the metrics from the Elastic Agent: