Importing MySQL in ES

(Kofi) #1

Hi, I'm trying to enrich data I have in ES with data I have in a MySQL database. I have a unique id for my ES documents that I can use to reference the data in the SQL db. My ES data has just a uniqueid and time, and the sql database has that same unique id for that event but then about 15 different fields that I'd like to peg onto the ES document.

Currently, I'm pulling the MySql db into my ES instance and then using the elasticsearch filter to peg on those fields. I'm using the jdbc input.

I'd like to know is the jdbc_streaming filter supposed to be used for this use case? I don't understand the documentation fully but if I wanted to attach 15 fields from the db to my ES document, how would it look? I'm having a hard time understanding/distinguishing the ":code", "parameters" and "target" in the example in the documentation. If I have uniqueID in my ES documents, is that the :code that references the SQL db (and thus I type :uniqueID)? If so, then if my fields in the mysql were location, ip, and size, would that go into the parameters or target or both? How does that look? I'm sorry but I'm new to this!!

  jdbc_streaming {
    jdbc_driver_library => "/path/to/mysql-connector-java-5.1.34-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => ""jdbc:mysql://localhost:3306/mydatabase"
    jdbc_user => "me"
    jdbc_password => "secret"
    statement => "select * from WORLD.COUNTRY WHERE Code = :code"
    parameters => { "code" => "country_code"}
    target => "country_details"


(Kofi) #2

Please help!

(Mark Walkom) #3


Basically equates to;

select * from WORLD.COUNTRY WHERE Code = country_code

Because it takes the assigned value and puts it in place of the variable.

I'm not really sure about target though sorry.

(Kofi) #4

@warkolm Do you think the use-case for the jdbc_streaming is what I'm looking to do? Or what is the purpose/general use-case of this filter?

(Mark Walkom) #5

Oh now I get target (it took a minute for my brain to work).
This is a filter plugin, so it expects to be fed data from an input. If there are fields from the database lookup that match the incoming fields, it'll overwrite them by default. Otherwise you can specify to write the lookups values to other fields.

(Kofi) #6

Ah okay, so if I wanted to add 15 fields, I'd probably have to do this filter 15 times (assuming my documents don't have the matching fields). I looked into the git repo and it seems like what I'm asking is coming down the pipeline with a PR 2 days ago. I'll be waiting for that

(Kofi) #7

Thanks though for the insight!!

(Guy Boertje) #8


You can use either enhancement plugin jdbc_streaming or jdbc_static (available to install as a plugin now and will be bundled with LS 6.2.0).

jdbc_streaming is designed for the case where your sql db has new data inserted often or where data is being updated quite frequently - local (in the filter) caching is short lived. For example a Product Catalog or a Warehouse product location mapping. Also, the db call (and caching) does not happen until an event is actually processed.

jdbc_static is designed for the case where you have several Reference data tables that are not updated frequently - local caching can be much longer lived. For example: hardware inventory, application inventory. The loader stage is executed during LS startup and so the local lookup database is "ready" for the first event when it arrives.

Why use the JDBC input alongside the JDBC Streaming or Static filters?

Sometimes, putting a JOIN in the JDBC input statement is not practical because it either slows down the query significantly or yields multiple events.
Using JDBC Streaming or Static in this case allows the original select statement to be simpler and run as fast as possible while the local lookup occurs at full speed too (allowing for the initial cache population delay of course).

(Guy Boertje) #9


Regarding understanding the role that the parameters setting plays, here is the updated parameters docs entry from the jdbc_static docs.

A key/value Hash or dictionary. The key (LHS) is the text that is substituted for in the SQL statement SELECT * FROM sensors WHERE reference = :p1. The value (RHS) is the field name in your event. The plugin reads the value from this field out of the event and substitutes that value into the statement, e.g. parameters ⇒ { "p1" ⇒ "[ref]" }. Quoting is automatic - you do not need to put quotes in the statement. Only use the field interpolation syntax on the RHS if you need to add a prefix/suffix or join two event field values together to build the substitution value. For example, imagine an IOT message that has an id and a location and you have a table of sensors that have a column of id-loc_id, in this case your parameter hash would look like this: parameters ⇒ { "p1" ⇒ "%{[id]}-%{[loc_id]}" }

I hope this explains it well enough.

(Kofi) #10

This is exactly what I needed to know. Thanks so much for your detailed response :slight_smile:

(Guy Boertje) #11

BTW, the parameters setting works the same way in both jdbc_streaming and jdbc_static.

(system) #12

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.