Limited to Initial 1000 Logs from API with Elastic Agent

Hi All,

We have configured our elastic-agent.yml to poll logs every 1 minute. However, the API returns a maximum of 1000 logs per poll, so we’re only receiving the first 1000 logs while the remaining logs are not being published.

Could someone please help update the elastic-agent.yml to implement a tailing mechanism using a scan frequency similar to Filebeat, or suggest another method to continuously tail the logs?

Below is our current elastic-agent.yml configuration for reference.

agent:
  logging:
    files:
      keepfiles: 7
      name: elastic-agent
      path: /var/log/elastic-agent/
      permissions: 420
    level: info
    to_files: true

inputs:
  - id: generic-httpjson-sb2-am
    type: tail
    streams:
      - config_version: 2
        data_stream:
          dataset: httpjson.generic
          type: logs
        id: httpjson-httpjson.sandbox2_am
        interval: 30s
        publisher_pipeline.disable_host: true
        request.method: GET
        request.ssl:
          verification_mode: none
        request.transforms:
          - set:
              target: header.x-api-key
              value: ------
          - set:
              target: header.x-api-secret
              value: ------
        request.url: -----=am-everything
        tags:
          - sandbox2_am
        env: sandbox2_am

  - id: generic-httpjson-sb2-idm
    type: tail
    streams:
      - config_version: 2
        data_stream:
          dataset: httpjson.generic
          type: logs
        id: httpjson-httpjson.sandbox2_idm
        interval: 30s
        publisher_pipeline.disable_host: true
        request.method: GET
        request.ssl:
          verification_mode: none
        request.tracer:
          filename: /var/log/elastic-agent/http-request-trace-idm-*.ndjson
          maxbackups: 5
        request.transforms:
          - set:
              target: header.x-api-key
              value: -----
          - set:
              target: header.x-api-secret
              value: -------
        request.url: ------
        tags:
          - sandbox2_idm
        env: sandbox2_idm

outputs:
  default:
    hosts:
      - b-1.amazonaws.com:9094
      - b-2.amazonaws.com:9094
      - b-3.amazonaws.com:9094
    producer:
      compression: gzip
    ssl:
      enabled: true
      truststore_location: /----/kafka.client.truststore.jks
      truststore_password: changeit
    topic: test_app_topic
    type: kafka

Thanks!

This depends on your API.

If it only returns 1000 logs per poll, it probably has some way to paginate requests, so you need to paginate to fetch the remaining logs.

How you do that depends on how the API implements pagination, but the httpjson input supports it: you would use the response.pagination section of the configuration to paginate the requests correctly, as in the example in the documentation.

If you want more examples, you can check how the Elastic Agent integrations that use the httpjson input handle this for different kinds of APIs here.
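As a rough sketch, a cursor-based pagination section often looks like the fragment below; the parameter name `cursor` and the response field `next_cursor` are placeholders for whatever your API actually expects and returns:

```yaml
response.pagination:
  # Hypothetical example: replace "cursor" and "next_cursor" with the
  # parameter and response field your API actually uses.
  - set:
      target: url.params.cursor
      value: '[[.last_response.body.next_cursor]]'
      # Stop paginating once the response no longer contains the field.
      fail_on_template_error: true
```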


Hi @leandrojmp ,

Thank you for providing the links. They were helpful: we are attempting to retrieve logs from the ForgeRock API using the configuration found here: integrations/packages/forgerock/data_stream/am_core/agent/stream/httpjson.yml.hbs at 38acb8874a439c584bdb5502eed4191f31efe25d · elastic/integrations. However, after applying this configuration, we no longer receive any logs in Elastic.

Below is the elastic-agent.yml we are using now:

agent:
  logging:
    files:
      keepfiles: 7
      name: elastic-agent
      path: /var/log/elastic-agent/
      permissions: 420
    level: info
    to_files: true

inputs:
  - id: generic-httpjson-sb2-am
    type: httpjson
    streams:
      - config_version: 2
        data_stream:
          dataset: httpjson.generic
          type: logs
        id: httpjson-httpjson.sandbox2_am
        interval: 30s
        publisher_pipeline.disable_host: true
        request.method: GET
        request.url: "https://..forgeblocks.com/monitoring/logs?source=am-everything"
        request.ssl:
          verification_mode: none
        request.rate_limit:
          limit: '[[.last_response.headers.Get "X-Rate-Limit-Limit"]]'
          remaining: '[[.last_response.headers.Get "X-Rate-Limit-Remaining"]]'
          reset: '[[.last_response.headers.Get "X-Rate-Limit-Reset"]]'
        request.transforms:
          - set:
              target: header.x-api-key
              value: "----"
          - set:
              target: header.x-api-secret
              value: "---"
          - set:
              target: url.params.beginTime
              value: '[[.cursor.last_timestamp]]'
              default: '[[ formatDate (now (parseDuration "-30m")) "2006-01-02T15:04:05-07:00" ]]'
          - set:
              target: url.params.endTime
              value: |-
                [[- $last := (parseDate .cursor.last_timestamp "2006-01-02T15:04:05-07:00") -]]
                [[- $recent := (now (parseDuration "-10s")) -]]
                [[- if $last.Before $recent -]]
                  [[ formatDate $recent "2006-01-02T15:04:05-07:00" ]]
                [[- else -]]
          [[ formatDate ($last.Add (parseDuration "1h")) "2006-01-02T15:04:05-07:00" ]]
                [[- end -]]
        response.split:
          target: body.result
          ignore_empty_value: true
        response.pagination:
          - set:
              target: url.params.endTime
              value: '[[.last_response.url.params.Get "endTime"]]'
          - set:
              target: url.params.beginTime
              value: '[[.last_response.url.params.Get "beginTime"]]'
          - set:
              target: url.params._pagedResultsCookie
              value: '[[.last_response.body.pagedResultsCookie]]'
              fail_on_template_error: true
        cursor:
          last_timestamp:
            value: '[[.last_response.url.params.Get "endTime"]]'
        tags:
          - sandbox2_am

outputs:
  default:
    hosts:
      - b-1.:9094
      - b-2.:9094
      - b-3.:9094
    producer:
      compression: gzip
    ssl:
      enabled: true
      truststore_location: /../kafka.client.truststore.jks
      truststore_password: "---"
    topic: test_app_topic
    type: kafka

Also, below is pipeline.conf file filter that we are using currently :

filter {
  json {
    source => "message"
    target => "parsed_message"
  }

  if [parsed_message][result] {
    split {
      field => "[parsed_message][result]"
    }
    # Prevent overwriting of existing values
    if ![result_timestamp] {
      mutate {
        add_field => { "result_timestamp" => "%{[parsed_message][result][timestamp]}" }
      }
    }
    if ![result_type] {
      mutate {
        add_field => { "result_type" => "%{[parsed_message][result][type]}" }
      }
    }
    if ![result_source] {
      mutate {
        add_field => { "result_source" => "%{[parsed_message][result][source]}" }
      }
    }

    if [parsed_message][result][payload] {
      mutate {
        add_field => { "payload_content" => "%{[parsed_message][result][payload]}" }
      }

      json {
        source => "payload_content"
        target => "fr"
        remove_field => ["payload_content"]
      }

      if [result_source] == "am-core" or [result_source] == "idm-core" {
        fingerprint {
          source => ["[fr][message]", "[fr][timestamp]", "[fr][transactionId]"]
          target => "[@metadata][fingerprint]"
          concatenate_sources => true
          method => "SHA256"
        }
      } else {
        fingerprint {
          source => ["[fr][_id]", "[fr][eventName]"]
          target => "[@metadata][fingerprint]"
          concatenate_sources => true
          method => "SHA256"
        }
      }
    }

    # Handle pagination parameters
    if [parsed_message][pagedResultsCookie] {
      mutate {
        add_field => { "pagination_cursor" => "%{[parsed_message][pagedResultsCookie]}" }
      }
    }

    prune {
      whitelist_names => ["^fr.*$", "^@metadata$", "^fingerprint$", "^@timestamp$", "^tags", "^result_.*$", "^env$", "^parsed_message.*$"]
    }
  }
}

Could you please check whether there is any issue with these configs and suggest any changes?

Thanks in advance!

If there is an Elastic Agent integration, why not use it instead of doing the parsing in Logstash?

What exactly is not working: the data collection in Elastic Agent, or the parsing in Logstash?

It is not clear which one is failing.
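If it helps to determine that, one quick way is to run the filter in isolation; a minimal sketch of a test pipeline, assuming you paste one sample API response on stdin (the filter block is the one you already have):

```
input {
  stdin { }
}

filter {
  # paste the contents of your existing filter { ... } block here
}

output {
  stdout { codec => rubydebug }
}
```

Run it with `bin/logstash -f test-pipeline.conf` and paste one JSON response line: if parsed events come out, the parsing side is fine and the problem is in the collection.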

Hi @leandrojmp ,

Currently, we are not using the Elastic Agent integration, as we are following a legacy approach with a standalone agent. However, we do plan to migrate to Fleet and eliminate the Logstash layer in the future.

Additionally, data collection is not functioning properly; we are only receiving a limited number of logs, and the Elastic Agent log shows the paginated request failing after retries (the long _pagedResultsCookie values are truncated here for readability):

"input_source":"https://forgeblocks.com/monitoring/logs?source=idm-everything","message":"error processing response: Get \"https://forgeblocks.com/monitoring/logs?_pagedResultsCookie=eyJfc29ydEtleXMiOm51bGws…&beginTime=2025-02-10T14%3A12%3A28%2B00%3A00&source=idm-everything\": GET https://forgeblocks.com/monitoring/logs?_pagedResultsCookie=eyJfc29ydEtleXMiOm51bGws…&beginTime=2025-02-10T14%3A12%3A28%2B00%3A00&source=idm-everything giving up after 6 attempt(s)"}

Regards!