Distributing DB input across multiple Logstash instances

ankh · June 14, 2016, 10:57pm

Very much a n00b here.

We're experimenting with an ELK install to index XML stored in an RDBMS. We're performing a fair amount of parsing on it to define certain fields of interest to be displayed in Kibana.

We expect that a single LS instance, even one running with multiple workers, will struggle to keep up with the incoming data, so we would like to install multiple LS instances.

However, I'm unsure how the incoming data can be distributed among the instances, preferably with a single config file (rather than customising it for each instance). Each row of data currently has a sequential identifier, so I thought the input query for an instance could, for example, apply a modulo function to the id of newly arrived data to determine which rows that instance will process. But how could this be done with a single config file? I suppose I could somehow incorporate the host name or some other instance-specific variable into the algorithm, but is there a simpler way to distribute data across multiple instances vying for the same input source?

Having said this, I also considered that this may be the wrong way to go about parallelising the LS operations, as it would mean (say) three instances are competing for the DB. Another option would be the reverse of the last diagram in https://www.elastic.co/guide/en/logstash/current/deploying-and-scaling.html - that is, to have a single LS shipper instance route the data from the DB into multiple queues, one for each LS indexer instance.

Am I on the right track?

Thanks!

magnusbaeck · June 19, 2016, 10:21am

You'll have to use different queries for different LS instances. Perhaps you can set environment variables that you reference in the queries (should work as of LS 2.3, IIRC)?
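To make the environment-variable idea concrete, here is a rough sketch in Python (not Logstash config) of how one shared template could produce a different query per instance. The variable names `LS_INSTANCE_INDEX` and `LS_INSTANCE_COUNT`, the table `documents`, and the columns `id` and `xml_doc` are all made up for illustration, and `MOD()` is assumed to exist in your SQL dialect (some databases use `%` instead).

```python
import os


def shard_query(env=os.environ):
    """Build the per-instance query from two (hypothetical) environment variables."""
    index = int(env.get("LS_INSTANCE_INDEX", "0"))  # which instance this is
    count = int(env.get("LS_INSTANCE_COUNT", "1"))  # how many instances exist
    if count < 1 or not 0 <= index < count:
        raise ValueError("instance index must be in the range 0..count-1")
    # Each instance only selects rows whose id falls into its modulo bucket.
    # :last_id stands for whatever "last row seen" marker the reader tracks.
    return (
        "SELECT id, xml_doc FROM documents "
        f"WHERE id > :last_id AND MOD(id, {count}) = {index} "
        "ORDER BY id"
    )


if __name__ == "__main__":
    print(shard_query({"LS_INSTANCE_INDEX": "2", "LS_INSTANCE_COUNT": "3"}))
```

The config stays identical on every host; only the two environment variables differ. Note that changing the instance count reassigns rows between buckets, so each instance's "last seen" marker would need some care at that point.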

Having one LS instance to read the DB and feed a single queue that any number of LS instances can fetch from seems like a better idea.
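The shape of that setup, one reader and many consumers sharing a single queue, can be sketched in plain Python. This is only an illustration of the pattern: the in-process `queue.Queue` stands in for an external message broker, the threads stand in for separate LS instances, and `run_pipeline` and `parse` are hypothetical names.

```python
import queue
import threading


def run_pipeline(rows, worker_count, parse):
    """One producer feeds a shared queue; worker_count consumers drain it."""
    work = queue.Queue(maxsize=100)  # bounded, so the reader cannot run far ahead
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            row = work.get()
            if row is None:  # sentinel: no more rows for this worker
                break
            parsed = parse(row)
            with lock:
                results.append(parsed)

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for t in threads:
        t.start()
    for row in rows:  # the single "DB reader" side
        work.put(row)
    for _ in threads:  # one sentinel per worker
        work.put(None)
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    doubled = run_pipeline(range(10), 3, lambda r: r * 2)
    print(sorted(doubled))
```

Because every consumer pulls from the same queue, each row is handled exactly once and no per-instance query logic is needed; adding capacity just means starting another consumer.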
