Logstash JDBC document_id not reading latest

bhrtjoshi · January 15, 2018, 11:31pm

Hello There,

I am following this article https://www.elastic.co/blog/logstash-jdbc-input-plugin to ingest data from an Oracle table. Eventually I need to pull result sets from some huge Oracle queries, but before I do that I am experimenting with a small dataset to make sure the config works.

What I want

Whenever I update an existing row in the DB, I also want my Elasticsearch document updated with those changes. I don't want a new document created with the updated fields.
Whenever I insert a new row in the DB, I want that new row to appear in Elasticsearch.

What's not working

I inserted 3 rows in the DB to start with, so I should get 3 documents in Elasticsearch, but I am getting only one.
I inserted a new row in the DB, but I cannot see the newly inserted row in Elasticsearch.

What is working

When I update the row in the DB (the same row that I have in Elasticsearch), I can see the changes reflected.

I am using document_id => "%{uid}" based on the above article. I do not have a "uid" column in my Oracle table, and the example in the article does not show uid as a DB column either.

Below is my config, which runs every 2 minutes.

Could someone please help me to fix this?

Below is my Config

input {
    jdbc {
        type => "temp"
        jdbc_validate_connection => true
        jdbc_connection_string => "my_connection_string"
        jdbc_user => "name"
        jdbc_password => "pwd"
        jdbc_driver_library => "opt/jdbc/lib/ojdbc7.jar"
        jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
        jdbc_default_timezone => 'America/Chicago'
        statement => "SELECT FIRSTNAME, LASTNAME, AGE, DOB, CITY, CREATED_DATE, UPDATED_DATE FROM TEST"
        schedule => "*/2 * * * *"
        }
}
output {
        elasticsearch {
                hosts => ["x.x.x.x:9200"]
                manage_template => true
                index => "<%{type}-{now/d}>"
                document_id => "%{uid}"
                }
}

bhrtjoshi · January 16, 2018, 7:31pm

Hello there, can someone please help and suggest?

bhrtjoshi · January 17, 2018, 5:31pm

@guyboertje thanks for updating the post to show config in readable format. How did you do that?

Also do you know if someone can help me here?

guyboertje · January 17, 2018, 5:38pm

Before you send data to ES I suggest that you experiment with the stdout output instead.

output { stdout { codec => rubydebug } }

You should then see your inserts and updates every two minutes.
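While the stdout output is in place, it is also worth checking whether the events carry a uid field at all. Logstash leaves a %{field} reference as literal text when the field does not exist, so with the config in the question every row would be written under the same document id, the literal string %{uid}, each one overwriting the last. A minimal sketch that flags such events in the rubydebug output (the tag name missing_uid is made up for illustration):

```
filter {
  # Tag events that have no uid field, so they stand out in the stdout output.
  if ![uid] {
    mutate { add_tag => ["missing_uid"] }
  }
}
output {
  stdout { codec => rubydebug }
}
```

If every event shows the tag, document_id needs to point at a field that actually exists and is unique per row.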

However, to truly achieve your first ask, you will need:

a unique document_id from your DB.
a fingerprint or hashid filter generated field, say fingerprint, that captures the current state of the record.
an elasticsearch filter to query whether the document exists in the target ES index by document_id and add the existing fingerprint to a field existing_fingerprint to the current event.
Filter logic to drop the event if the fingerprint and existing_fingerprint are equal (you got this record already)
Filter logic to add a metadata field called action set to update when existing_fingerprint field is present and it is not equal to fingerprint - with an else clause that adds a metadata field called action set to index (the elasticsearch output action that adds a new document; it has no insert action)
Use action => "%{[@metadata][action]}" in your elasticsearch output.
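The steps above might be wired together roughly as follows. This is a sketch under stated assumptions, not a tested pipeline. It assumes the query is changed to also select a unique key column (a hypothetical ID column, which arrives as the field id because the JDBC input lowercases column names by default), that documents go to a single index named temp rather than one index per day, and that the installed fingerprint filter accepts SHA256 without a key. The host and the other field names are taken from the config in the question.

```
filter {
  # Capture the current state of the record in one hash.
  fingerprint {
    source => ["firstname", "lastname", "age", "dob", "city", "updated_date"]
    concatenate_sources => true
    method => "SHA256"
    target => "fingerprint"
  }

  # Look up the stored document by id and copy its fingerprint onto this event.
  # If nothing matches, existing_fingerprint is simply not set.
  elasticsearch {
    hosts => ["x.x.x.x:9200"]
    index => "temp"
    query => "_id:%{id}"
    fields => { "fingerprint" => "existing_fingerprint" }
  }

  if [existing_fingerprint] {
    if [existing_fingerprint] == [fingerprint] {
      # Unchanged since the last run: nothing to write.
      drop { }
    } else {
      mutate {
        add_field => { "[@metadata][action]" => "update" }
        remove_field => ["existing_fingerprint"]
      }
    }
  } else {
    # New row. The documented output actions are index, delete, create and update.
    mutate { add_field => { "[@metadata][action]" => "index" } }
  }
}
output {
  elasticsearch {
    hosts => ["x.x.x.x:9200"]
    index => "temp"
    document_id => "%{id}"
    action => "%{[@metadata][action]}"
  }
}
```

The fingerprint has to be stored in the document itself (not under @metadata), otherwise the lookup on the next run has nothing to compare against.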

What you really need and we don't have (but are gathering info about) is Change Data Capture. Each DB tech does CDC differently, so it's not something we can shoehorn into the JDBC input.
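Short of real CDC, the JDBC input can at least be asked for only the rows changed since the previous run, by tracking a column through :sql_last_value. A sketch, assuming UPDATED_DATE is set on insert and advanced on every update (the thread does not confirm this); the connection settings are copied unchanged from the question:

```
input {
  jdbc {
    jdbc_connection_string => "my_connection_string"
    jdbc_user => "name"
    jdbc_password => "pwd"
    jdbc_driver_library => "opt/jdbc/lib/ojdbc7.jar"
    jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
    jdbc_default_timezone => "America/Chicago"
    # Only fetch rows touched since the last value seen in updated_date.
    statement => "SELECT FIRSTNAME, LASTNAME, AGE, DOB, CITY, CREATED_DATE, UPDATED_DATE FROM TEST WHERE UPDATED_DATE > :sql_last_value ORDER BY UPDATED_DATE"
    use_column_value => true
    tracking_column => "updated_date"
    tracking_column_type => "timestamp"
    schedule => "*/2 * * * *"
  }
}
```

This is polling rather than change capture: it never sees deleted rows, and it still needs a unique document_id on the output side so that an updated row replaces its existing document.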

guyboertje · January 17, 2018, 5:42pm

Formatting change is to enclose the code in Markdown triple backticks

system · February 14, 2018, 5:42pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
