Getting duplicate records with the Logstash JDBC plugin


(Praveen Kumar) #1

Hi,

I'm getting duplicate records when I run the query every minute.

Here is my conf file.

input {
  jdbc {
    jdbc_connection_string => "jdbc:oracle:thin:@dbhost:port:SID"
    # The user we wish to execute our statement as
    jdbc_user => "username"
    jdbc_password => "password"
    jdbc_validate_connection => true
    # Run the query once every minute
    schedule => "* * * * *"
    # The path to our downloaded jdbc driver
    jdbc_driver_library => "/opt/Oracle/jdbc/ojdbc6.jar"
    # The name of the driver class for Oracle
    jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
    # our query
    statement => "select SHORT_CODE,SHORT_CODE_LONG_VAL,VALUE_TYPE,VALUE,CREATED_DTTM,IMEI from short_code"
  }
}

filter {
  date {
    locale => "en"
    match => [ "time_stamp", "dd-MMM-yy HH:mm:ss.SSSSSSSSS a" ]
    timezone => "America/Los_Angeles"
    #target => "logdate"
    remove_field => "time_stamp"
  }
}

output {
  elasticsearch {
    hosts => ["host name"]
    index => "short_code_index"
    document_type => "short_code"
    document_id => "%{SHORT_CODE}"
  }
  stdout { codec => rubydebug }
}

Appreciate your help!


(Magnus Bäck) #2

You have two choices:

  • Set the id of the inserted documents (via the elasticsearch output's document_id option) to a fixed value based on some or all of the event's fields (use the fingerprint filter) so that any existing document is updated instead of a new one being created.
  • If you have a "last modified" column for each row you can change your query to only return the rows modified since the last run. The jdbc input can keep track of when that was.
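
Neither option is spelled out in the thread, so here is a rough sketch of each. For the first option, a minimal example assuming the rows are uniquely identified by the short_code, imei and created_dttm fields (that combination is a guess, as is the key value; the jdbc input lowercases column names by default, hence the lowercase field names):

```
filter {
  fingerprint {
    # Assumed choice of fields: use whichever columns uniquely identify a row.
    source => ["short_code", "imei", "created_dttm"]
    # Hash all listed fields together rather than only the last one.
    concatenate_sources => true
    method => "SHA256"
    # Hypothetical value; some older plugin versions require a key for SHA methods.
    key => "any-fixed-string"
    # Fields under @metadata are not sent to Elasticsearch.
    target => "[@metadata][fingerprint]"
  }
}

output {
  elasticsearch {
    hosts => ["host name"]
    index => "short_code_index"
    # Same row => same fingerprint => the existing document is overwritten.
    document_id => "%{[@metadata][fingerprint]}"
  }
}
```

For the second option, a sketch that assumes CREATED_DTTM is a timestamp column and can stand in for a "last modified" column. It would only pick up newly created rows, not rows changed later, and the tracking_column_type option may not be available in older versions of the jdbc input:

```
input {
  jdbc {
    jdbc_connection_string => "jdbc:oracle:thin:@dbhost:port:SID"
    jdbc_user => "username"
    jdbc_password => "password"
    jdbc_validate_connection => true
    schedule => "* * * * *"
    jdbc_driver_library => "/opt/Oracle/jdbc/ojdbc6.jar"
    jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
    # Only fetch rows newer than the value remembered from the previous run.
    statement => "select SHORT_CODE,SHORT_CODE_LONG_VAL,VALUE_TYPE,VALUE,CREATED_DTTM,IMEI from short_code where CREATED_DTTM > :sql_last_value order by CREATED_DTTM"
    # Track the highest created_dttm seen instead of the time of the last run.
    use_column_value => true
    tracking_column => "created_dttm"
    tracking_column_type => "timestamp"
  }
}
```

The two approaches can also be combined: the incremental query keeps each run small, and the fixed document id protects against duplicates if a row is ever fetched twice.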

(Praveen Kumar) #3

Thanks for your reply.

When I add document_id (document_type => "short_code") to the output, only the first record shows up in Kibana.

Could you please provide an example conf file that uses document_id, if you have one?


(Magnus Bäck) #4

When I add document_id (document_type => "short_code") to the output, only the first record shows up in Kibana.

Yes, because you're giving all documents the id "short_code". If you have a field with that name and you want the document id to be the contents of that field you should use document_id => "%{short_code}".
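
As a minimal sketch of that output section, assuming the SHORT_CODE column is unique per row and arrives as the lowercase field short_code (the jdbc input lowercases column names by default):

```
output {
  elasticsearch {
    hosts => ["host name"]
    index => "short_code_index"
    # %{...} substitutes the field's value; a bare "short_code" would be
    # used literally as the id of every document.
    document_id => "%{short_code}"
  }
  stdout { codec => rubydebug }
}
```

If several rows can share the same short code, they would overwrite each other under this id, so a fingerprint over more fields would be the safer choice in that case.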

