Dynamic URL input in Logstash

Hello,

I am trying to use a URL as an input in Logstash, but the response is too large for Logstash to load all at once with http_poller (it fails with a memory error). What I can do is add query parameters such as "pageNo" (page number) and "pageSize" to the link, dividing the one large request into smaller URLs that Logstash can handle.

An example of what my link could look like:
"https://templink.com/events"

Split into 5 URLs (the number depends on how many events the URL returns) with 20,000 events each:
"https://templink.com/events/?pageSize=20000&pageNo=1"
"https://templink.com/events/?pageSize=20000&pageNo=2"
"https://templink.com/events/?pageSize=20000&pageNo=3"
"https://templink.com/events/?pageSize=20000&pageNo=4"
"https://templink.com/events/?pageSize=20000&pageNo=5"
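
As a rough sketch of how such a list could be generated: this assumes the total number of events is known in advance (the 100000 below is made up to match the five pages above), and paged_urls is a hypothetical helper name, not part of Logstash.

```python
from math import ceil
from urllib.parse import urlencode


def paged_urls(base_url, total_events, page_size=20000):
    """Return one URL per page, with page numbers starting at 1."""
    # Round up so a final, partially filled page is not dropped.
    pages = ceil(total_events / page_size)
    return [
        f"{base_url}/?{urlencode({'pageSize': page_size, 'pageNo': page_no})}"
        for page_no in range(1, pages + 1)
    ]


# Assumed total of 100000 events, i.e. 5 pages of 20000 events each.
urls = paged_urls("https://templink.com/events", total_events=100000)
for url in urls:
    print(url)
```

The page count is rounded up, so 20001 events would give two pages rather than one.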

I could then put each of these links into http_poller individually and process them in Logstash.

My question is whether Logstash can handle an input that is too large to process at once by creating these "sub-URLs" with "pageSize" and "pageNo" itself, and then treat each new URL as its own input, without my having to do it manually.

I have achieved this with a Python script that splits my URL into multiple URLs and feeds them into Logstash over sockets via the TCP input. I am not satisfied with this, though, because it creates a new dependency, and I would rather have everything done in Logstash.
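
For reference, a minimal sketch of the kind of script described here, not the actual one. It assumes the URLs themselves are sent as one JSON object per line to a Logstash tcp input that uses the json_lines codec on localhost:5000. The host, port, the "url" field name and both function names are placeholders.

```python
import json
import socket


def send_lines(sock, urls):
    """Send one JSON object per line over a connected socket; return bytes sent."""
    payload = "".join(json.dumps({"url": u}) + "\n" for u in urls).encode("utf-8")
    sock.sendall(payload)
    return len(payload)


def feed_logstash(urls, host="localhost", port=5000):
    """Open a TCP connection to the (assumed) Logstash tcp input and send the URLs."""
    with socket.create_connection((host, port)) as sock:
        return send_lines(sock, urls)
```

Calling feed_logstash(urls) would only work while a matching tcp input is listening, which is exactly the extra moving part this approach adds.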

This is what my Logstash input looks like:

input {
	stdin {}
	http_poller {
		urls => {
			q1 => {
				method => get
				url => "https://templink.com/events"
				headers => {
					Accept => "application/json"
				}
			}
		}
		request_timeout => 60
		schedule => { cron => "* * * * * UTC"}
		#codec => json
		metadata_target => "http_poller_metadata"
	}
	
}

Have I misunderstood or misused http_poller? Is there a way to split my URL into smaller URLs and feed them into the input, or do I have to keep using the Python script to feed the content via TCP?
