Hi,
I am currently evaluating (and have no further experience so far) the ELK stack.
I am wondering if it would be possible to import data from multiple dynamic created MariaDB databases into Elasticsearch.
Short backgrond, we have a multi-tenant application where for each of our customer the data is kept in a separate database. So whenever a new customer comes, a new database is created. The available customers are managed in a central database.
What I want to do now is, fetch the available Customer (can go into the thousands) on the input and then enrich with data from the customers respective database. The databases of the customers do all look the same.
So what I did try to achieve this was:
input {
jdbc {
jdbc_connection_string => "jdbc:mariadb://127.0.0.1:3306/eg_test?sessionVariables=sql_mode=ANSI_QUOTES"
jdbc_user => "user"
jdbc_password => "secret"
jdbc_driver_library => "/dataservice/mariadb-java-client-2.7.4.jar"
jdbc_driver_class => "org.mariadb.jdbc.Driver"
jdbc_page_size => 200000
jdbc_paging_enabled => true
statement => "SELECT
customer
FROM eg_test.customers"
}
}
filter {
jdbc_streaming {
jdbc_connection_string => "jdbc:mariadb://127.0.0.1:3306/db_%{[customer]}?sessionVariables=sql_mode=ANSI_QUOTES"
jdbc_user => "user"
jdbc_password => "secret"
jdbc_driver_library => "/dataservice/mariadb-java-client-2.7.4.jar"
jdbc_driver_class => "org.mariadb.jdbc.Driver"
statement => "SELECT
data,
COUNT(something) AS cnt
FROM customer_table
WHERE ctime >= UNIX_TIMESTAMP('${START_DATE}')
AND ctime < UNIX_TIMESTAMP('${END_DATE}')
GROUP BY data"
parameters => { "customer" => "[customer]"}
target => "data"
}
}
output {
# Add to the elastic search index
elasticsearch {
"hosts" => "127.0.0.1:9200"
"index" => "customer_data_test"
}
}
Tough this fails as the jdbc_streaming plugin is unable to resolve the field reference "%{[customer]}" in the jdbc_connection_string .
The error I get back is:
DatabaseConnectionError: Java::JavaSql::SQLSyntaxErrorException: Could not connect to address=(host=127.0.0.1)(port=3306)(type=master) : (conn=165) Unknown database 'db_%{[customer]}'>
So is there a possibility to do this?
The other way I could think of is, create the logstash config dynamically with another script and add each database as input. But would this scale with a few thousand customers?
Thanks for any input,
Enrico