Logstash in a distributed environment


(Arun Prakash) #1

Hi All,

Could anyone tell me how Logstash can be used in a distributed environment?

I want to configure Logstash in a distributed environment and use it to run ETL batch jobs there.

Thanks in Advance!


(Magnus Bäck) #2

Please be more specific. What, exactly, does "distributed environment" mean? What's the end goal?


(Arun Prakash) #3

Hi Magnusbaeuk,

I have to run an ETL load of more than a billion records from various databases. If I run Logstash on a single node, it will take a very long time, so I plan to run the same conf file on many nodes. How can I achieve that? What configuration is needed in the logstash.yml file? I have to complete this task in as little time as possible, and the same script will later be used for incremental loads too.


(Magnus Bäck) #4

There's no mechanism for Logstash instances to talk to each other so you have to figure out a way for them to work independently. A few options come to mind:

  • Use different queries for each instance. Instance 1 only fetches rows whose id ends with 1 or 2, instance 2 only fetches rows whose id ends with 3 or 4, and so on.
  • Use a message broker like Kafka or RabbitMQ as a buffer between some process that pulls from the database (which may or may not be Logstash) and the Logstash instances that work on the broker's queue.
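
To make the first option concrete, here is a minimal Python sketch that generates one non-overlapping query per instance. It generalizes the "id ends with" idea to a modulo on a numeric id column. The function names, the `orders` table, and the `id` column are hypothetical, and `MOD(...)` is assumed to be available in your database (MySQL, PostgreSQL, and Oracle have it; other databases may need the `%` operator instead).

```python
from typing import List


def partition_predicate(instance: int, total: int, id_column: str = "id") -> str:
    """Return a SQL predicate selecting the rows owned by one instance.

    Instances are numbered 0..total-1; a row belongs to the instance whose
    number equals the row id modulo the instance count.
    """
    if total < 1:
        raise ValueError("total must be at least 1")
    if not 0 <= instance < total:
        raise ValueError("instance must be in range(total)")
    return f"MOD({id_column}, {total}) = {instance}"


def partition_queries(table: str, total: int, id_column: str = "id") -> List[str]:
    """Build one SELECT per instance; together they cover every row once."""
    return [
        f"SELECT * FROM {table} WHERE {partition_predicate(i, total, id_column)}"
        for i in range(total)
    ]


def owner(row_id: int, total: int) -> int:
    """Return the number of the instance that would fetch a given row id."""
    return row_id % total


if __name__ == "__main__":
    # Hypothetical table name; print the query each of 5 instances would run.
    for query in partition_queries("orders", 5):
        print(query)
```

Each generated query could then be used as the `statement` of the jdbc input on one Logstash instance, so the instances never need to coordinate. This assumes non-negative integer ids that are spread reasonably evenly; a skewed id distribution would give some instances more work than others.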
