Greetings community,
I stuck with development of data processing pipeline where is one step is fetch data from DB by array of ids.
Prerequisites:
- logstash 7.17
- logstash-filter-jdbc_streaming (OOTB)
- jdbc client for maria db 3.3.3
the event in logstash where I put the ids looks like this
{
"ids" => [
[0] "22",
[1] "7",
[2] "6"
],
"@timestamp" => 2024-02-28T16:19:26.279Z,
"str_ids" => "22,7,6",
"tags" => [..]
}
Note! I added str_ids
for experiments.
My filter config is:
filter {
jdbc_streaming {
jdbc_driver_library => "/usr/share/jdbc/mariadb-java-client-3.3.3.jar"
jdbc_driver_class => "org.mariadb.jdbc.Driver"
jdbc_validate_connection => true
jdbc_validation_timeout => 300
jdbc_connection_string => "jdbc:mariadb://database:3306/MY_DB?autoReconnect=true"
jdbc_user => "user"
jdbc_password => "password"
statement => "
select
id,
name,
description
from MY_TABLE
where
id in (:ids)
"
parameters => {
ids => "[ids]"
}
target => "result"
}
}
The problems found and actions I tried to self them:
- this origin set up gives me an error
[WARN ][logstash.pluginmixins.jdbcstreaming.normalstatementhandler][test][1346a67166141ca9d3ccad4459ff19a23511cbe24f256dbb5e99c81628523d36] Exception when executing JDBC query {:statement=>"..", :parameters=>{:ids=>["6", "7", "22"]}, :exception=>#<Sequel::DatabaseError: Java::JavaSql::SQLException: (conn=269) Illegal parameter data types int and row for operation '='>}
I have no idea why ids are sorted but any way.
- then, I tried to do with
%{}
changed params definition to
parameters => {
ids => "%{[ids]}"
}
..and it starts give me results.
BUT! It applies only 1st id from list (in my case it is "7")
- then, I tried to do it with string representation of list.
my params looks like this
parameters => {
ids => "[str_ids]"
}
the result is the same as previous - only 1st id is applied
- I was pretty upset and decided to check it static query works well
changed statement like this
statement => "
select
id,
name,
description
from MY_TABLE
where
id in (7,22,6)
"
The result is what I expected - all ids were applied.
- I have tried to use placeholder right in the statement instead of parameters option
statement => "
select
id,
name,
description
from MY_TABLE
where
id in (%{[str_ids]})
"
buy it seems did not go via string processing so I get next error
{
:exception=>#<Sequel::DatabaseError: Java::JavaSql::SQLSyntaxErrorException: (conn=406) You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near '%{[str_ids]
}
- I tried 1) 2) 3) step but with prepared statement ON in hope it makes more in params processing
so jdbc config looks like this
use_prepared_statements => true
prepared_statement_name => "my_main_stmt"
statement => "
select
id,
name,
description
from MY_TABLE
where
id in (?)
"
prepared_statement_bind_values => [
"[ids]" // then "%{[ids]}", "[str_ids]" , "%{[str_ids]}"
]
the results:
a)
{
:exception=>java.sql.SQLException: Type org.jruby.RubyArray not supported type
}
b) success, only 1st id was applied
c) success, only 1st id was applied
d) success, only 1st id was applied
HOW TO DO IT WORKS!?
Fom my point of view IN (..)
condition is pretty regular so some where is missmatch between jdbc driver and jdbc streaming plugin.
Link to source logstash-filter-jdbc_streaming/lib/logstash/plugin_mixins/jdbc_streaming/parameter_handler.rb at 2ed23b51c0b9b1a57e39bf672f6f013eae88d5fc · logstash-plugins/logstash-filter-jdbc_streaming · GitHub