Use the same connection for multiple Logstash configurations

Andre_Silva · September 7, 2018, 2:42pm

I'm using Logstash 2.4.1 to load data to Elasticsearch 2.4.6.
I have the following Logstash config:

input {
    jdbc {
        jdbc_connection_string => "jdbc:oracle:thin:@database:1521:db1"
        jdbc_user => "user"
        jdbc_password => "password"
        jdbc_driver_library => "ojdbc6-11.2.0.jar"
        jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
        parameters => { "id" => 1 }
        statement => "SELECT modify_date, userName from user where id = :id AND modify_date >= :sql_last_value"

        schedule => "*/1 * * * *"
        tracking_column => modify_date
    }
}
output {
    elasticsearch {
        hosts => ["localhost:9200"]
        index => "index1"
        document_type => "USER"
    }
    stdout { codec => rubydebug }
}

So, every minute, it queries the database to check whether there is new data for Elasticsearch.
It works perfectly, but there is one problem:
We have around 100 clients, and they are all in the same database instance.

That means I have 100 scripts and will have 100 instances of Logstash running, meaning 100 open connections:

nohup ./logstash -f client-1.conf Logstash startup
nohup ./logstash -f client-2.conf Logstash startup
nohup ./logstash -f client-3.conf Logstash startup
nohup ./logstash -f client-4.conf Logstash startup
nohup ./logstash -f client-5.conf Logstash startup
and so on...

This is just bad.

Is there any way I can use the same connection for all my scripts?
The only differences between those scripts are the id parameter and the index name; each client will have a different id and a different index:

parameters => { "id" => 1 }
index => "index1"

Any ideas?

elasticforme · September 7, 2018, 6:45pm

I am not an expert in this, but can't you just pull user and modify_date for every row of the user table? That would load the records for all 100 clients into Elasticsearch in one go, and you could then check the modify date via Kibana.

magnusbaeck · September 10, 2018, 6:09am

Just select all rows (i.e. drop the id = :id condition in the query), include the id column in the SELECT clause, and reference the customer id in the output configuration:

index => "index%{id}"

(It's most likely not a great idea to have a separate index for each customer. Make sure you know what you're doing.)
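A minimal sketch of what the combined configuration could look like, reusing the settings from the original post. It assumes the client id column is literally named id and that it appears on the event as a field called id; the rubydebug output shows the actual field name and how its value is rendered, so check that before relying on the index names.

```
input {
    jdbc {
        jdbc_connection_string => "jdbc:oracle:thin:@database:1521:db1"
        jdbc_user => "user"
        jdbc_password => "password"
        jdbc_driver_library => "ojdbc6-11.2.0.jar"
        jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
        # No "id = :id" condition and no parameters: one query covers every client.
        # The id column is selected so the output can reference it.
        statement => "SELECT id, modify_date, userName from user where modify_date >= :sql_last_value"
        schedule => "*/1 * * * *"
    }
}
output {
    elasticsearch {
        hosts => ["localhost:9200"]
        # Assumption: the event carries the client id in a field named "id",
        # so a row with id 1 is meant to land in "index1".
        index => "index%{id}"
        document_type => "USER"
    }
    stdout { codec => rubydebug }
}
```

With this, a single Logstash instance and a single scheduled query replace the 100 per-client configurations.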

Andre_Silva · September 10, 2018, 10:10am

Hi Magnus.

Yeah, I can do that. I was planning on many indexes because separating the clients would keep each index smaller and make searches faster.
Why do you think it is not a great idea to use separate indexes?

magnusbaeck · September 10, 2018, 10:15am

Indexes have a fixed memory overhead so you'll waste resources if you have too many of them. What gives the best performance depends on a lot of factors and you shouldn't assume that greater separation is necessarily advantageous.
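For comparison, a sketch of the single-index alternative. The index name clients is hypothetical and not from this thread; it assumes the id column is selected in the query so that each document keeps its client id as a field and searches can filter on it.

```
output {
    elasticsearch {
        hosts => ["localhost:9200"]
        # Hypothetical shared index name; every client's rows go here,
        # and the "id" field on each document identifies the client.
        index => "clients"
        document_type => "USER"
    }
}
```

Whether this or one index per client performs better depends on data volume and query patterns, so it is worth measuring both.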

Andre_Silva · September 10, 2018, 10:16am

Makes sense... I'll try that and let you know, thanks!

Andre_Silva · September 13, 2018, 9:10am

Worked like a charm, no performance issues.
Thanks!

system · October 11, 2018, 9:10am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
