How to ingest paginated API data into Elasticsearch?

Hi everyone,

I’m trying to ingest data from a web application API into an Elasticsearch index.

Context:

  • The API is HTTP-based and returns JSON data

  • It uses cursor-based pagination

  • I need to regularly fetch all data and index it into Elasticsearch

What I tried:

  • I tried using Logstash with the http_poller input plugin

  • However, it seems that http_poller does not support pagination (or I couldn’t find a way to implement it)

Current workaround:

  • I wrote a Python script that fetches data from the API and pushes it to Elasticsearch

  • It works, but I would prefer a more "ELK-native" solution if possible

My question:

  • Is there a way to handle paginated API ingestion using Logstash or another Elastic tool?

  • Or is a custom script (Python) the recommended approach in this case?

Any feedback or best practices would be appreciated.

Thanks in advance!

Hello and welcome,

The http_poller input cannot paginate. The recommended native way would be to use the CEL input with Elastic Agent, as described here.

Since you would need to learn how to write CEL programs, I would say it is much easier to just use Python. This is what I do in similar cases.
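
For reference, here is a rough sketch of what the Python approach can look like, using only the standard library. Everything specific to the API is an assumption, since the original post does not describe it: the URLs, the index name, the `cursor` query parameter, and the `items` / `next_cursor` / `id` response fields are all hypothetical and need to be adapted. The only Elasticsearch piece it relies on is the `_bulk` endpoint with a newline-delimited JSON body.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Iterable, Iterator, Optional

# Hypothetical values: replace with the real API and cluster details.
API_URL = "https://example.com/api/items"
ES_BULK_URL = "http://localhost:9200/_bulk"
INDEX_NAME = "api-items"
BATCH_SIZE = 500


def fetch_page(cursor: Optional[str]) -> dict:
    """GET one page. Assumes the API accepts ?cursor=... and returns
    {"items": [...], "next_cursor": "<token>" or null}."""
    url = API_URL
    if cursor:
        url += "?" + urllib.parse.urlencode({"cursor": cursor})
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def iter_items(fetch: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every item, following the cursor until none is returned."""
    cursor = None
    while True:
        page = fetch(cursor)
        yield from page.get("items", [])
        cursor = page.get("next_cursor")
        if not cursor:
            break


def build_bulk_body(items: Iterable[dict], index: str) -> str:
    """Build the NDJSON payload for the _bulk API: one action line
    followed by one source line per document, newline-terminated."""
    lines = []
    for item in items:
        action = {"index": {"_index": index}}
        # Assumed "id" field: reusing it as _id makes re-runs overwrite
        # existing documents instead of creating duplicates.
        if "id" in item:
            action["index"]["_id"] = str(item["id"])
        lines.append(json.dumps(action))
        lines.append(json.dumps(item))
    return "\n".join(lines) + "\n"


def send_bulk(body: str) -> dict:
    """POST one bulk request and return the parsed response."""
    req = urllib.request.Request(
        ES_BULK_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)


def main() -> None:
    """Fetch all pages and index them in batches of BATCH_SIZE."""
    batch = []
    for item in iter_items(fetch_page):
        batch.append(item)
        if len(batch) >= BATCH_SIZE:
            send_bulk(build_bulk_body(batch, INDEX_NAME))
            batch = []
    if batch:
        send_bulk(build_bulk_body(batch, INDEX_NAME))
```

Calling `main()` from cron or any other scheduler covers the "regularly fetch" part. In a real script you would also want to add authentication and check the `errors` flag in each bulk response, since the request as a whole can succeed while individual documents are rejected.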