Jdbc-streaming plugin seems not working


(VittorioP) #1

hi all,

I'm using jdbc filter plugin to enrich data from Postgres DB but I don't see the output log enriched.

here's the configuration I'm using:

input
{
    stdin { }
}

filter {
  jdbc_streaming {
    jdbc_driver_library => "/home/vittorio/Downloads/postgresql-42.1.1.jre6.jar"
    jdbc_driver_class => "org.postgresql.Driver"
    jdbc_connection_string => "jdbc:postgresql://127.0.0.1:5432/postgres"
    jdbc_user => "postgres"
    jdbc_password => "mypass"
    statement => "SELECT * FROM playground WHERE location = 'location'"

    target => "postgres"
  }
}
output {
  stdout{
    codec => rubydebug
  }
}

the output is simply :

{
      "@timestamp" => 2017-05-12T09:15:11.655Z,
        "@version" => "1",
            "host" => "DonVito-A-tono",
    "install_date" => [
        [0] {}
    ],
         "message" => "mymessage",
            "tags" => [
        [0] "_jdbcstreamingdefaultsused"
    ]
}

(Magnus Bäck) #2
statement => "SELECT * FROM playground WHERE location = 'location'"

This looks suspicious. Do you really want to enrich the events with static data? Did you mean to say :location rather than 'location'? The _jdbcstreamingdefaultsused tag indicates that the query returned an empty result set.


(VittorioP) #3

I'm just trying to understand how it works. I want to enrich the log, I didn't really understand the parameters part.

the data sample I have is this:

equip_id | type  | color | location | install_date 
----------+-------+-------+----------+--------------
        2 | prova | green | north    | 2017-05-12

I want to take all of it (or just the location or the color for example) and put in the log.


(Magnus Bäck) #4

So your database table only contains this row? Then just remove the WHERE clause in your query.

If the table contains multiple rows you have to pick which one to use for the enrichment. You can use query parameters to ask queries based on the events passing through Logstash. That's the normal use case for this plugin.


(VittorioP) #5

No, It is not just this row, this is actually the second.


(VittorioP) #6

And because, as you said, my table contains multiple raws, I should do something like :

statement => "SELECT * FROM playground WHERE location = :location"
parameters => {"location" => "north"}

is this the correct usage? If yes, it didn't work..


(Magnus Bäck) #7

No, because you don't have a field named "north". The right hand side of the parameters map is a field name. If you want to look up a static string you can just do this:

SELECT * FROM playground WHERE location = 'north'

(VittorioP) #8

the field named "north" should come from the log that is being processed by logstash?


(Magnus Bäck) #9

I still don't understand exactly what you're trying to do so I don't know how to answer your question.


(VittorioP) #10

this caused an exception:

[2017-05-12T12:18:38,160][ERROR][logstash.pipeline ] Exception in pipelineworker, the pipeline stopped processing new events, please check your filter configuration and restart Logstash. {"exception"=>"Missing Valuefier handling for full class name=org.jruby.RubyObject, simple name=RubyObject",

the exception happened because I removed parameters from the pipeline, but it should be not required...

I'm sorry If didn't explain well, I'm just trying this plugin, enrich the log with some data from my db and then put it somewhere else. That's it what you didn't understand?


(VittorioP) #11

up :slight_smile:


(Guy Boertje) #12

So in your db there is a table called playground that holds records of equipment installed in some playgrounds. I presume.

lets say your log events have a field called equipment_id.
e.g.

{
  "family": "Jetson",
  "ticket": "all day pass",
  "equipment_id": 2,
  "date": "2017-04-12 09:00:00"
}

The parameters setting will map the substitution symbol in your SQL Query to the field name in your event that contains a value that should be used in the where clause for a query to your db table - it should be a unique id or natural key.

If your query is SELECT * FROM playground WHERE equip_id = :a_place_holder then your parameters setting will look like this parameters => {"a_place_holder" => "equipment_id"}. As Magnus said, this maps the substitution symbol to the event field name.

But if your event field is equip_id and you use the same text for the substitution symbol part in your query SELECT * FROM playground WHERE equip_id = :equip_id then the parameters looks like this parameters => {"equip_id" => "equip_id"}

Essentially the parameters setting allows for one to use Logstash Event.sprintf type substitutions. Say your db table has a prefix as part of the key e.g. A-2 or A-42 but your event only has integer values then the parameters will be parameters => {"equip_id" => "A-%{[equip_id]}"}

Hope this helps


(Magnus Bäck) #13

the exception happened because I removed parameters from the pipeline, but it should be not required...

I think the problem is that one of the columns in your select clause has a data type that can't be converted automatically. See https://github.com/logstash-plugins/logstash-input-jdbc/issues/172 for something very similar. Perhaps you can select a subset of the columns or convert the offending value to e.g. a varchar?


(Guy Boertje) #14

@magnusbaeck
Its probably a Date datatype. Logstash Events tries to convert time like objects to a Timestamp internally. I don't mean the Event timestamp, I mean an instance of LogStash::Timestamp object.

Because Date does not have a time element we can't really convert it to a Time like object, because timezones IIRC.

@Vittorio
You should either cast the Date to a varchar and use the Date filter to parse it or not include it in the query e.g. SELECT type, color, location FROM playground WHERE equip_id = :equip_id NOTE you don't need the column equip_id because its in the event already.


(VittorioP) #15

thanks to all :slight_smile:


(Guy Boertje) #16

Does this mean you got it to enhance your events correctly?

Its important because AFAIK this is the first question we have had on jdbc_streaming and I would like others to see clearly that it works.


(VittorioP) #17

yes, I managed to enhance the event correctly. But I changed the input to file instead of stdin.

Using stdin input I still get no result.


(Guy Boertje) #18

stdin should would work - are you pasting a complete line + newline with the same content as a line from the file that you read with the file input?


(VittorioP) #19

With the stdin I simply write vittorio that will be stored in the message field then in the filter jdbc I use message as parameter for the SQL query.

Doing this way I get :

[2017-05-15T11:03:06,603][DEBUG][logstash.pipeline        ] filter received {"event"=>{"@timestamp"=>2017-05-15T09:03:06.593Z, "@version"=>"1", "host"=>"DonVito-A-tono", "message"=>"vittorio"}}
[2017-05-15T11:03:06,607][DEBUG][logstash.filters.jdbcstreaming] Executing JDBC query {:statement=>"SELECT name, surname,age FROM userstate WHERE name = ':a_place_holder'", :parameters=>{:a_place_holder=>"vittorio"}}
[2017-05-15T11:03:06,629][WARN ][logstash.filters.jdbcstreaming] Exception when executing JDBC query {:exception=>#<Sequel::DatabaseError: Java::OrgPostgresqlUtil::PSQLException: ERROR: syntax error at or near "vittorio"
  Position: 56>}

(Guy Boertje) #20

You do not need to wrap :a_place_holder in quotes in your query - this is because the plugin knows how to do this for different datatypes in SQL. It knows that "vittorio" is a string so it will surround it with the correct quotes for the db type when it does the substitution.

Simply :statement=>"SELECT name, surname,age FROM userstate WHERE name = :a_place_holder" will be fine.