REST API, http_poller and ruby code

nandrik · April 11, 2019, 3:54pm

I wonder if the following functionality, which I've successfully implemented in the form of a client-side Python script, can be completely replaced by server-side Logstash configuration, using a combination of http_poller input and ruby filter plugins.

This is what my Python script currently does:

Polls a REST API every x seconds and asks for the latest record
If the latest record is more than one step ahead of the last one downloaded, it runs a for-loop and downloads the new records one by one, ensuring that no record is missed
It stores the last_run, i.e. the last record downloaded, so that if the script fails, it can restart from that point
It can load historical records if I provide it with a starting record and the number of records that I want it to download
It writes a log, which I then ship to Elastic using Filebeat
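
For readers who want something concrete, here is a minimal sketch of the loop described above. It is not my actual script: the names catch_up, backfill, fetch_latest_id, fetch_record and handle are hypothetical, the REST calls are passed in as plain callables so no real endpoint is assumed, and it assumes record ids are consecutive integers and that the very first run should fetch only the newest record.

```python
import json
import os
import tempfile
from typing import Callable, Optional


def load_last_id(state_path: str) -> Optional[int]:
    """Return the checkpointed record id, or None if no checkpoint exists yet."""
    try:
        with open(state_path, "r", encoding="utf-8") as fh:
            return json.load(fh)["last_id"]
    except FileNotFoundError:
        return None


def save_last_id(state_path: str, last_id: int) -> None:
    """Write the checkpoint to a temp file, then rename it over the old one."""
    directory = os.path.dirname(os.path.abspath(state_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump({"last_id": last_id}, fh)
    os.replace(tmp_path, state_path)


def catch_up(
    fetch_latest_id: Callable[[], int],    # hypothetical: asks the API for the newest id
    fetch_record: Callable[[int], dict],   # hypothetical: downloads one record by id
    handle: Callable[[dict], None],        # e.g. append the record to a log file
    state_path: str,
) -> int:
    """Download every record after the checkpoint; return how many were handled."""
    latest = fetch_latest_id()
    last_id = load_last_id(state_path)
    # Assumption: with no checkpoint, start at the newest record only.
    first = latest if last_id is None else last_id + 1
    processed = 0
    for record_id in range(first, latest + 1):
        handle(fetch_record(record_id))
        # Checkpoint after each record so a crash resumes at the next one.
        save_last_id(state_path, record_id)
        processed += 1
    return processed


def backfill(
    fetch_record: Callable[[int], dict],
    handle: Callable[[dict], None],
    start_id: int,
    count: int,
) -> int:
    """Historical load of `count` records from `start_id`; leaves the checkpoint alone."""
    for record_id in range(start_id, start_id + count):
        handle(fetch_record(record_id))
    return count
```

Calling catch_up from a loop with a sleep of x seconds between calls gives the polling behaviour. Because the checkpoint is written only after handle returns, a crash in between can cause one record to be handled twice, but never skipped.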

The reasons why I used client-side scripting rather than server-side Logstash plugins are the following:

The http_poller input plugin seems to be stateless (see this post from @guyboertje), but I wonder whether writing to a file from ruby could get around that
There's no easy way to do a first query for the latest record id and then run a loop to retrieve the new records without waiting for the sleep_time between requests. Again, maybe all this is possible in ruby.

On the flip side, there would be some advantages if I could do this server-side in Logstash, namely:

A server-side configuration can easily be used across multiple sources, without the need to install and monitor a script and without client-side dependencies such as log rotation
This could become the basis of a custom plugin that further enriches and enhances the received data in a cloud-based way.

I wonder if anyone has experience with this dilemma, and whether you also resorted to client-side scripting or think this is achievable with a more advanced use of Logstash and its plugins.

Badger · April 11, 2019, 4:06pm

You cannot do it using an http_poller input. No ruby filters will have executed when the input runs and there is no way to pass state to it.

You might be able to use an http filter. You could use any of the inputs that have a schedule option to create dummy events, then use a combination of ruby and http filters to do the work on that schedule.

It feels a bit like solving the Towers of Hanoi problem in sendmail.cf. It can be done, and it is interesting to see it work, but that doesn't make it a good idea.

nandrik · April 11, 2019, 4:18pm

@Badger, thanks for the reply. I was not aware that the HTTP Filter Plugin offers similar functionality, and it does make sense that it can be combined more easily with Ruby filters.

I guess I need to find some examples that combine these filters so I can assess the trade-off in complexity of this approach, since my experience with ruby inside Logstash so far has been that it's quite hard to debug and to verify that it's working properly.

And I wonder if at that point you reach the stage of contemplating the development of a custom Logstash plugin.

