I am trying to do just the basic configuration of trying to get some data from our mysql database into elasticsearch in order to do fulltext searches in a web application.
We have the basic install done and in fact have used ELK before with APM to do development metrics and debugging, with this installation.
However, now i'm trying to configure logstash to basically grab one table from our database and I'm having no end of troubles. I've tried to follow the way things are described on the logstash docs and the jdbc input plugin docs but nothing seems to work.
I have the jdbc-mariadb driver and the jdbc mysql ruby gem installed, i'm not getting errors any more, but nothing is happening, seemingly no relevant clues in logs even though I have all the verboseness and logging cranked all the way up.
this is running on a Debian 10 server with latest version of whole stack (7.7.0).
basic logstash config is:
input {
jdbc {
id => "jdbc-members"
jdbc_validate_connection => true
jdbc_validation_timeout => "10"
jdbc_driver_library => "/usr/share/java/mariadb-java-client.jar"
jdbc_driver_class => "org.mariadb.jdbc.Driver"
jdbc_connection_string => "jdbc:mysql://db-mirror---censored---.freecycle.org:3306/newsystem"
jdbc_user => "---censored---"
jdbc_password => "---- censored----"
schedule => "0 */15 * * *"
statement => "SELECT user_id, username, last_name, first_name, email from users WHERE user_id > :sql_last_value"
jdbc_fetch_size => 100
last_run_metadata_path => "/usr/share/logstash/logstash_jdbc_last_run"
record_last_run => true
tracking_column => "user_id"
use_column_value => true
sql_log_level => "debug"
connection_retry_attempts => 3
}
}
output {
elasticsearch {
hosts => "127.0.0.1:9200"
id => "members"
index => "members-%{+YYYY.MM}"
}
stdout {
codec => rubydebug {metadata => true}
}
}
The last run id file is not being created or written to, nor are there any permissions errors appearing about that. there's no error about connection to the mysql server but there's no sign of it trying when i netstat
. there's no index showing up in ES/Kibana.
I've been beating my head against this wall for 3 days. I've tried different connectinon validation settings (including no validation), i've tried connecting as different users, i've tried running logstash under systemd and also from command line as root. I've tried with and without a "record last run"... i've tried smaller queries, and different values of the fetch size. nothing seems to work. i must be missing some basic thing. Add to all this the fact that everytime I try something new, it takes 10-30 minutes for logstash to restart and finish doing its configuration thing, it seems.
What is the current best practice? it seems like there's a lot of different ways to try to do this. Do I need to use Beats to connect to the db, rather than directly in logstash? if so, why? Do we need Hadoop? if so, why? what's the best tutorials for this? and why is this stuff so complicated? it's all very frustrating. (and just for reference, I'm not new to configuring and installing complex systems. i've been working in the IT industry for about 25 years. The only technologies i've seen almost as arcane and maddening have been SNMP, LDAP, and Sendmail.)
thanks for any advice you might have.