Unable to manage high load using elastic-agent.yml

Hi All,

We are collecting logs from the ForgeRock API using Elastic Agent, which forwards the logs to Kafka. From there, the logs are processed by Logstash.

We are noticing a delay in log ingestion starting from the Elastic Agent itself, particularly during high-load testing scenarios. The delay is visible directly in the Elastic Agent logs. Could anyone please review the elastic-agent.yml config below and suggest how to tune it to handle high load?

Below is the elastic-agent.yml config:

logging:
  files:
    keepfiles: 7
    name: elastic-agent
    path: /var/log/elastic-agent/
    permissions: 420
  level: debug
  to_files: true

inputs:
  - id: generic-httpjson-staging
    type: httpjson
    streams:
      - config_version: 2
        data_stream:
          dataset: httpjson.generic
          type: logs
        id: httpjson-httpjson.staging
        interval: 30s
        publisher_pipeline.disable_host: true
        request.method: GET
        request.ssl:
          verification_mode: none
        request.url: "https://forgerock.io/monitoring/logs?source=am-everything,idm-everything"
        request.rate_limit:
          limit: '[[.last_response.header.Get "X-Ratelimit-Limit"]]'
          remaining: '[[.last_response.header.Get "X-Ratelimit-Remaining"]]'
          reset: '[[.last_response.header.Get "X-Ratelimit-Reset"]]'
          early_limit: 5
        request.retry:
          max_attempts: 5
          wait_min: 5s
          wait_max: 30s
        request.tracer:
          filename: /var/log/elastic-agent/http-request-trace-*.ndjson
          maxbackups: 5
        request.transforms:
          - set:
              target: header.x-api-key
              value: ----
          - set:
              target: header.x-api-secret
              value: ----
          - set:
              target: url.params.beginTime
              value: '[[.cursor.last_timestamp]]'
              default: '[[ formatDate (now (parseDuration "-1h")) "2006-01-02T15:04:05-07:00" ]]'
          - set:
              target: url.params.endTime
              value: |-
                [[- $last := (parseDate .cursor.last_timestamp "2006-01-02T15:04:05-07:00") -]]
                [[- $day := (parseDuration "24h") -]]
                [[- $end := 0 -]][[- /* Predeclare $end. */ -]]
                [[- with $last -]]
                  [[- $end = .Add $day -]]
                [[- end -]]
                [[- with $end -]]
                  [[- $recent := (now (parseDuration "-10s")) -]][[- /* Ensure that the API has stabilised the documents' presence. */ -]]
                  [[- if .Before $recent -]]
                    [[- formatDate $end "2006-01-02T15:04:05-07:00" -]]
                  [[- else -]]
                    [[- formatDate $recent "2006-01-02T15:04:05-07:00" -]]
                  [[- end -]]
                [[- end -]]
              default: |-
                [[- $start := (now (parseDuration "-1h")) -]]
                [[- $day := (parseDuration "24h") -]]
                [[- $end := 0 -]][[- /* Predeclare $end. */ -]]
                [[- with $start -]]
                  [[- $end = .Add $day -]]
                [[- end -]]
                [[- with $end -]]
                  [[- $recent := (now (parseDuration "-10s")) -]][[- /* Stabilisation time. */ -]]
                  [[- if .Before $recent -]]
                    [[- formatDate $end "2006-01-02T15:04:05-07:00" -]]
                  [[- else -]]
                    [[- formatDate $recent "2006-01-02T15:04:05-07:00" -]]
                  [[- end -]]
                [[- end -]]
        response.split:
          target: body.result
          ignore_empty_value: true
        response.pagination:
          - set:
              target: url.params.endTime
              value: '[[.last_response.url.params.Get "endTime"]]'
          - set:
              target: url.params.beginTime
              value: '[[.last_response.url.params.Get "beginTime"]]'
          - set:
              target: url.params._pagedResultsCookie
              value: '[[.last_response.body.pagedResultsCookie]]'
              fail_on_template_error: true
        cursor:
          last_timestamp:
            value: '[[.last_response.url.params.Get "endTime"]]'
        tags:
          - staging
        fields:
          environment: "staging"
        processors:
          - fingerprint:
              fields: ["@timestamp", "message"]
              target_field: "fingerprint"
              method: "sha256"
              encoding: "hex"

outputs:
  default:
    hosts:
      - b-1.kafka-----.amazonaws.com:9094
      - b-2.kafka-----.amazonaws.com:9094
      - b-3.kafka-----.amazonaws.com:9094
    producer:
      compression: gzip
    ssl:
      enabled: true
      truststore_location: /etc/pki/tls/certs/kafka.client.truststore.jks
      truststore_password: "----"
    topic: test_app_topic
    type: kafka
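
To clarify the kind of tuning we are asking about: as far as we understand, the output also supports memory-queue settings (queue.mem.events, queue.mem.flush.min_events, queue.mem.flush.timeout) that control how events are batched before being published. A minimal sketch of what we mean, with purely illustrative values (not settings we currently run):

outputs:
  default:
    type: kafka
    # hosts, ssl and topic as in the config above
    # Illustrative values only -- we have not validated these for our load.
    queue.mem:
      events: 8192            # total events the in-memory queue can hold
      flush.min_events: 2048  # publish in larger batches
      flush.timeout: 5s       # or after this timeout, whichever comes first

Is this the right knob for high load, or should we be looking at the httpjson input side instead?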

Can you share the logs that show this? It is not clear whether the delay is in the agent fetching logs from the API, processing them, or sending them to the output.

Hi @leandrojmp,

We are seeing the following trace:

"{"log.level":"debug","@timestamp":"2025-05-06T11:50:06.387+0100","message":"HTTP response","transaction.id":"BKS9GQ21SOT1G-97263","http.response.status_code":429,"http.response.body.content":"{"errors":["Rpc Error: Code = ResourceExhausted Desc = Quota Exceeded For Quota Metric 'Read Requests' And Limit 'Read Requests Per Minute Per User' Of Service 'Logging.Googleapis.Com' For Consumer 'Project_number:301743521374'.\nError Details: Name = ErrorInfo Reason = RATE_LIMIT_EXCEEDED Domain = Googleapis.Com Metadata = Map[Consumer:Projects/301743521374 Quota_limit:ReadRequestsPerMinutePerUser Quota_limit_value:60 Quota_location:Global Quota_metric:Logging.Googleapis.Com/Read_requests Quota_unit:1/Min/{Project}/{User} Service:Logging.Googleapis.Com]\nError Details: Name = Help Desc = Request A Higher Quota Limit. Url = Https://Cloud.Google.Com/Docs/Quotas/Help/Request_increase\"]}""

Regards

This is unrelated to the Elastic Agent. You are receiving a 429 error from the endpoint you are querying; the endpoint itself is rate limiting you.
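
Note also that every pagination request within a single poll counts against that quota, so with interval: 30s plus multiple pages per poll you can exceed 60 read requests per minute quickly. The error itself links to Google's quota-increase page; if raising the quota is not an option, the only agent-side lever is to make fewer requests per minute. A rough sketch with illustrative values (you would need to tune them to how many pages each poll actually fetches):

interval: 2m          # poll less often so polling plus pagination stays under the quota
request.retry:
  max_attempts: 5
  wait_min: 10s       # back off longer when a 429 is returned
  wait_max: 60s

Also, your request.rate_limit templates only take effect if the endpoint actually returns those X-Ratelimit-* headers; if it does not, they do nothing.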