JDBC duplicate records sending to SumoLogic


(William Weerts) #1

Hi, I have been struggling for several days to understand and prevent duplicate records being sent from JDBC (Oracle) to SumoLogic using the logstash-output-sumologic plugin. I have searched the forums, but all the fixes seem to be for Elasticsearch and I am not sure they apply to my scenario. Below is my conf file, with my DB connection info and my output set to sumologic.

input {
  jdbc {
    jdbc_validate_connection => true
    jdbc_connection_string => "jdbc:oracle:thin:@server:9999:dbname"
    jdbc_user => ""
    jdbc_password => ""
    jdbc_driver_library => "C:\logstash\ojdbc7.jar"
    jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
    statement => "select t.log_id as LogId,really log sql statement where t.log_id > :sql_last_value"
    use_column_value => true
    tracking_column => LogId
    record_last_run => true
    last_run_metadata_path => "C:\logstash\bin\logstash_jdbc_last_run"
    schedule => "*/5 * * * *"
  }
}

output {
  sumologic {
    url => "https://endpoint1.collection.sumologic.com/blah"
    format => "%{@json}"
  }
}

I am at a loss, because the way the SQL is written it should not be sending any duplicates. Could it have anything to do with the multiple date fields returned by the SQL? I appreciate any help. Thanks.


(William Weerts) #2

I figured out why I was getting duplicates. The tracking column is what marks my place between runs of the query, but when I ran the query myself I noticed that it did not return rows sorted by that column. That is what caused the duplicates: sometimes the last row returned did not hold the highest log_id, so the saved value was too low and the next run pulled the same data in again. To fix it, all I did was add an ORDER BY on LogId, and now the duplicates are gone.
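
For anyone landing here later, this is roughly what the fixed input looks like. It is only a sketch: the real statement is much longer, so the table name app_log and the message column are made-up stand-ins, and the connection settings are the placeholders from the first post. The tracking column is written in lowercase on the assumption that the jdbc input's default column-name lowercasing is in effect.

```conf
input {
  jdbc {
    jdbc_connection_string => "jdbc:oracle:thin:@server:9999:dbname"
    jdbc_user => ""
    jdbc_password => ""
    jdbc_driver_library => "C:\logstash\ojdbc7.jar"
    jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
    # Hypothetical table and columns; the point is the ORDER BY at the end.
    # Sorting ascending on the tracking column means the last row of each
    # run carries the highest log_id, so the saved value is never too low.
    statement => "SELECT t.log_id AS logid, t.message FROM app_log t WHERE t.log_id > :sql_last_value ORDER BY t.log_id ASC"
    use_column_value => true
    # Assumed lowercase to match the plugin's default column-name handling.
    tracking_column => "logid"
    tracking_column_type => "numeric"
    record_last_run => true
    last_run_metadata_path => "C:\logstash\bin\logstash_jdbc_last_run"
    schedule => "*/5 * * * *"
  }
}
```

Without the ORDER BY, Oracle is free to return the rows in any order, so the value written to the last-run file may be lower than the highest log_id already sent.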

