Reading mysql database: You have an error in your SQL syntax;

I have a mysql fake database which I would like to read as input to logstash & then send the data to elasticsearch.

The database IP, the password, the port, the database and the table are correct As well as the query:

From the database host

$ mysql -h -u root -p -Bse "USE employees; SELECT * FROM employees;"

From logstash

$ logstash -f mylogstash/config-dir/pipelines.conf --config.reload.automatic

12:49:00.100 [Ruby-0-Thread-18: /usr/share/logstash/vendor/bundle/jruby/1.9/gems/rufus-scheduler-3.0.9/lib/rufus/scheduler/jobs.rb:283] ERROR logstash.inputs.jdbc - Java::ComMysqlJdbcExceptionsJdbc4::MySQLSyntaxErrorException: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ';) ASt1LIMIT 1' at line 1: SELECT count(*) AScountFROM (SELECT * FROM employees;) ASt1LIMIT 1

$ cat pipelines.conf

input {
	jdbc { 
		jdbc_driver_library => "/mylogstash/plugin/mysql-connector-java-5.1.23-bin.jar"
		jdbc_driver_class => "com.mysql.jdbc.Driver"
		jdbc_connection_string => "jdbc:mysql://"
		jdbc_user => "root"
		jdbc_password => "passwd"
		statement => "SELECT * FROM employees;"
		schedule => "* * * * *"
		jdbc_paging_enabled => "true"
		jdbc_page_size => "50000"
filter {

output {

    elasticsearch {
        hosts => [ "elasticsearch:9200" ]
        document_type => "default"
        http_compression => true

The query looks OK.
Any hint what is wrong with the config?

Well, I'd try taking the semicolon off the "statement => ...", but the SELECT COUNT(*) FROM ... looks like something is going to run your query twice, so best hope that that isn't a performance disaster.

1 Like

Thanks @TimWard for your prompt reply, indeed, that worked!

I am new to logstash, why the single request is going to be read twice?
What is the performance risk?

Something somewhere is generating "SELECT COUNT(*) FROM (<your query>)" and is also going to have to run <your query> separately to get the actual results. So < your query> is likely to be run twice (depending on how clever the RDBMS might be at optimising the first query, but given how catastrophic COUNT(*) is on many RDBMSs I wouldn't hold out much hope).

If the employees table isn't very big and/or isn't scanned very often this isn't going to be a problem, but if you're already at the limits of performance you could run into problems unnecessarily early as the table grows.

Do you mean that I need to run multiple specific queries to get the data from the database instead of reading the entire table and get filter specific data?

Sorry, I was just trying to help with your original problem of the syntax error, and observed the generated second query in passing. I don't know anything specific about what the Logstash JDBC input is doing or how to control it.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.