Logstash invalid byte sequence in UTF-8 jdbc input

Hi,
I am getting the error "invalid byte sequence in UTF-8" from the jdbc input on Logstash 7.3.0.

My config is below:
input{
jdbc{
jdbc_driver_library => "C:\sqljdbc42.jar"
jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
jdbc_connection_string => "jdbc:sqlserver://MYSERVER;integratedsecurity=true;databaseName=Db;"
jdbc_user => "user"
jdbc_password => ""
jdbc_validate_connection => true
jdbc_pool_timeout => 10
jdbc_default_timezone => "Etc/UTC"
connection_retry_attempts => 3
connection_retry_attempts_wait_time => 2
lowercase_column_names => false
statement_filepath => "C:\sql_1.sql"
schedule => "3 * * * * *"
codec => line { charset => "UTF-8" }
}
}

output {
elasticsearch {
hosts => ["http://localhost:9200"]
index => "my_pattern"
}
}

The error:
{ 2048 rufus-scheduler intercepted an error:
2048 job:
2048 Rufus::Scheduler::CronJob "3 * * * * *" {}
2048 error:
2048 2048
2048 ArgumentError
2048 invalid byte sequence in UTF-8
2048 org/jruby/RubyString.java:4790:in `partition'
2048 C:/Elastic/Logstash/7.3.0/vendor/bundle/jruby/2.5.0/gems/sequel-5.22.0/lib/sequel/dataset/sql.rb:600:in `placeholder_literal_string_sql_append'
2048 C:/Elastic/Logstash/7.3.0/vendor/bundle/jruby/2.5.0/gems/sequel-5.22.0/lib/sequel/sql.rb:112:in `to_s_append'
2048 C:/Elastic/Logstash/7.3.0/vendor/bundle/jruby/2.5.0/gems/sequel-5.22.0/lib/sequel/dataset/sql.rb:1236:in `literal_expression_append'
2048 C:/Elastic/Logstash/7.3.0/vendor/bundle/jruby/2.5.0/gems/sequel-5.22.0/lib/sequel/dataset/sql.rb:89:in `literal_append'
2048 C:/Elastic/Logstash/7.3.0/vendor/bundle/jruby/2.5.0/gems/sequel-5.22.0/lib/sequel/dataset/sql.rb:263:in `literal'
2048 C:/Elastic/Logstash/7.3.0/vendor/bundle/jruby/2.5.0/gems/sequel-5.22.0/lib/sequel/dataset/sql.rb:1581:in `static_sql'
2048 C:/Elastic/Logstash/7.3.0/vendor/bundle/jruby/2.5.0/gems/sequel-5.22.0/lib/sequel/dataset/sql.rb:236:in `select_sql'
2048 C:/Elastic/Logstash/7.3.0/vendor/bundle/jruby/2.5.0/gems/sequel-5.22.0/lib/sequel/adapters/utils/emulate_offset_with_row_number.rb:45:in `select_sql'
2048 C:/Elastic/Logstash/7.3.0/vendor/bundle/jruby/2.5.0/gems/sequel-5.22.0/lib/sequel/adapters/shared/mssql.rb:664:in `select_sql'
2048 C:/Elastic/Logstash/7.3.0/vendor/bundle/jruby/2.5.0/gems/sequel-5.22.0/lib/sequel/dataset/actions.rb:152:in `each'
2048 C:/Elastic/Logstash/7.3.0/vendor/bundle/jruby/2.5.0/gems/logstash-input-jdbc-4.3.13/lib/logstash/plugin_mixins/jdbc/jdbc.rb:257:in `perform_query'
2048 C:/Elastic/Logstash/7.3.0/vendor/bundle/jruby/2.5.0/gems/logstash-input-jdbc-4.3.13/lib/logstash/plugin_mixins/jdbc/jdbc.rb:229:in `execute_statement'
2048 C:/Elastic/Logstash/7.3.0/vendor/bundle/jruby/2.5.0/gems/logstash-input-jdbc-4.3.13/lib/logstash/inputs/jdbc.rb:277:in `execute_query'
2048 C:/Elastic/Logstash/7.3.0/vendor/bundle/jruby/2.5.0/gems/logstash-input-jdbc-4.3.13/lib/logstash/inputs/jdbc.rb:258:in `block in run'
2048 C:/Elastic/Logstash/7.3.0/vendor/bundle/jruby/2.5.0/gems/rufus-scheduler-3.0.9/lib/rufus/scheduler/jobs.rb:234:in `do_call'
2048 C:/Elastic/Logstash/7.3.0/vendor/bundle/jruby/2.5.0/gems/rufus-scheduler-3.0.9/lib/rufus/scheduler/jobs.rb:258:in `do_trigger'
2048 C:/Elastic/Logstash/7.3.0/vendor/bundle/jruby/2.5.0/gems/rufus-scheduler-3.0.9/lib/rufus/scheduler/jobs.rb:300:in `block in start_work_thread'
2048 C:/Elastic/Logstash/7.3.0/vendor/bundle/jruby/2.5.0/gems/rufus-scheduler-3.0.9/lib/rufus/scheduler/jobs.rb:299:in `block in start_work_thread'
2048 org/jruby/RubyKernel.java:1425:in `loop'
2048 C:/Elastic/Logstash/7.3.0/vendor/bundle/jruby/2.5.0/gems/rufus-scheduler-3.0.9/lib/rufus/scheduler/jobs.rb:289:in `block in start_work_thread'
2048 tz:
2048 ENV['TZ']:
2048 Time.now: 2019-08-12 17:39:03 +0200
2048 scheduler:
2048 object_id: 2012
2048 opts:
2048 {:max_work_threads=>1}
2048 frequency: 0.3
2048 scheduler_lock: #<Rufus::Scheduler::NullLock:0x7116ed8d>
2048 trigger_lock: #<Rufus::Scheduler::NullLock:0x7ef55089>
2048 uptime: 4250.077 (1h10m50s77)
2048 down?: false
2048 threads: 2
2048 thread: #Thread:0x46af4b51
2048 thread_key: rufus_scheduler_2012
2048 work_threads: 1
2048 active: 1
2048 vacant: 0
2048 max_work_threads: 1
2048 mutexes: {}
2048 jobs: 1
2048 at_jobs: 0
2048 in_jobs: 0
2048 every_jobs: 0
2048 interval_jobs: 0
2048 cron_jobs: 1
2048 running_jobs: 1
2048 work_queue: 0
} 2048 .

What is missing?

The jdbc input does not use a codec, because the data it receives over JDBC is already structured.

From the backtrace, I can see that the error occurs while the SQL statement is being built, before it gets sent to the server for execution, i.e. on the way up, not on the way down.

The most likely cause is the file referenced by statement_filepath => "C:\sql_1.sql". Logstash assumes that file is UTF-8. Is it saved in a different encoding?
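
If it helps, here is a minimal Python sketch for checking whether a statement file is valid UTF-8 and, if it is not, rewriting it as UTF-8. The helper names `is_valid_utf8` and `reencode_to_utf8` are made up for this example, and the source encoding is something you have to know or guess yourself (for example "cp1252" or "utf-16" are common on Windows, but that is only an assumption about your file).

```python
from pathlib import Path


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the raw bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def reencode_to_utf8(src: Path, dst: Path, source_encoding: str) -> None:
    """Decode src with source_encoding and write the same text to dst as UTF-8.

    source_encoding is an assumption supplied by the caller; if it is wrong,
    decoding may fail or produce the wrong characters.
    """
    text = src.read_bytes().decode(source_encoding)
    dst.write_bytes(text.encode("utf-8"))
```

For example, `is_valid_utf8(Path(r"C:\sql_1.sql").read_bytes())` returning False would suggest the file needs converting, after which you could call `reencode_to_utf8` with the encoding your editor saved it in and point statement_filepath at the new file.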

Yes, it was this problem, thank you!
