Basic sql to ES config

I am trying to do just the basic configuration of trying to get some data from our mysql database into elasticsearch in order to do fulltext searches in a web application.
We have the basic install done and in fact have used ELK before with APM to do development metrics and debugging, with this installation.
However, now i'm trying to configure logstash to basically grab one table from our database and I'm having no end of troubles. I've tried to follow the way things are described on the logstash docs and the jdbc input plugin docs but nothing seems to work.

I have the jdbc-mariadb driver and the jdbc mysql ruby gem installed, i'm not getting errors any more, but nothing is happening, seemingly no relevant clues in logs even though I have all the verboseness and logging cranked all the way up.

this is running on a Debian 10 server with latest version of whole stack (7.7.0).
basic logstash config is:

    input {
      jdbc {
        id => "jdbc-members"
        jdbc_validate_connection => true
        jdbc_validation_timeout => "10"
        jdbc_driver_library => "/usr/share/java/mariadb-java-client.jar"
        jdbc_driver_class => "org.mariadb.jdbc.Driver"
        jdbc_connection_string => "jdbc:mysql://db-mirror---censored---.freecycle.org:3306/newsystem"
         jdbc_user => "---censored---"
         jdbc_password => "---- censored----"
        schedule => "0 */15 * * *"
        statement => "SELECT user_id, username, last_name, first_name, email from users WHERE user_id > :sql_last_value"
        jdbc_fetch_size => 100
        last_run_metadata_path => "/usr/share/logstash/logstash_jdbc_last_run"
        record_last_run => true
        tracking_column => "user_id"
        use_column_value => true
        sql_log_level => "debug"
        connection_retry_attempts => 3
      }
    }


    output {
     elasticsearch {
       hosts => "127.0.0.1:9200"
       id => "members"
       index => "members-%{+YYYY.MM}"
     }
    stdout {
        codec => rubydebug {metadata => true}
      }
    }

The last run id file is not being created or written to, nor are there any permissions errors appearing about that. there's no error about connection to the mysql server but there's no sign of it trying when i netstat. there's no index showing up in ES/Kibana.
I've been beating my head against this wall for 3 days. I've tried different connectinon validation settings (including no validation), i've tried connecting as different users, i've tried running logstash under systemd and also from command line as root. I've tried with and without a "record last run"... i've tried smaller queries, and different values of the fetch size. nothing seems to work. i must be missing some basic thing. Add to all this the fact that everytime I try something new, it takes 10-30 minutes for logstash to restart and finish doing its configuration thing, it seems.

What is the current best practice? it seems like there's a lot of different ways to try to do this. Do I need to use Beats to connect to the db, rather than directly in logstash? if so, why? Do we need Hadoop? if so, why? what's the best tutorials for this? and why is this stuff so complicated? it's all very frustrating. (and just for reference, I'm not new to configuring and installing complex systems. i've been working in the IT industry for about 25 years. The only technologies i've seen almost as arcane and maddening have been SNMP, LDAP, and Sendmail.)

thanks for any advice you might have.

what i usually do is set the output to stdout or log files, see that it actually fetches the data using jdbc with minimum config (i.e. without scheduling and sql_last_value, etc). just plain metadata. also run it from cli to see any error logs that may be thrown. so just a basic

input {
jdbc {
id => "jdbc-members”
jdbc_driver_library => "/usr/share/java/mariadb-java-client.jar"
jdbc_driver_class => "org.mariadb.jdbc.Driver"
jdbc_connection_string => "jdbc:mysql://db-mirror---censored---.freecycle.org:3306/newsystem"
jdbc_user => "---censored---"
jdbc_password => "---- censored----“
statement => "SELECT user_id, username, last_name, first_name, email from users limit 5"
jdbc_fetch_size => 100
}
}

output {
stdout {}
}

then kinda build it from there slowly. also you can pass the config.reload.automatic parameter to logstash cli to save the startup time so you can make changes on the fly

ok that helps a bit.
after paring down some config i managed to see some errors about the driver.
It turns out that the mariadb connector that Debian says is what should work (and they've removed the mysql one in Debian 10) is the problem. I had to find a debian package for the mysql connector on the mysql site, here https://dev.mysql.com/downloads/connector/j/
and install that.
This gets me a step closer. I'm seeing the data come in from mysql at least...
thanks!!!

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.