Loading incremental data into Elasticsearch from Oracle database


(Ibr_B) #1

Hi all,

I need to Load incremental data into Elasticsearch from Oracle database, after the first load ( via Logstash), i need to load only the data updated.

Could you advise on this please?

Regrads,


(Magnus Bäck) #2

Does the data have a last modified column that you can use to select only the rows that have changed since a certain point in time?


(Ibr_B) #3

Yes we have that column "HistCreationTime". How i can use this column?

Thank you!


(Magnus Bäck) #4

Have you read the State and Predefined parameters section of the jdbc input documentation?


(Ibr_B) #5

I have read about sql_last_value and schedule parameter
the schedule works find, but i don't know how to use sql_last_value! , i already use jdbc_validate_connections, jdbc_user, password, driver in the input of Logstash.
But i can't see last_run_metadata_path, tracking_column_type, tracking_column etc ...
Where can i find them? Maybe there are another jdbc plugin i have to install?

Thank you,


(Magnus Bäck) #6

sql_last_value is the name of a query parameter that you can reference in your SQL query. Logstash will populate that parameter with either the time or the value it processed the last time, so you'd typically use it like in this example: https://www.elastic.co/guide/en/logstash/current/plugins-inputs-jdbc.html#plugins-inputs-jdbc-tracking_column_type

But i can't see last_run_metadata_path, tracking_column_type, tracking_column etc ...
Where can i find them?

Find what? Their documentation?


(Ibr_B) #7

Find those parameters> For example when i run logstash i have error said "last_run_metadata_path" doesn't exist


(Magnus Bäck) #8

What does your configuration look like? Please copy/paste the exact error message. Also, what version of the logstash-input-jdbc plugin do you have (run logstash-plugin list --version to find out)?


(Ibr_B) #9

The version of Logstash is 5.2.1
The version of logstash-input-jdbc-4.1.3
Below the link of my logstash-input-jdbc:
C:\tmp\logstash-5.2.1\vendor\bundle\jruby\1.9\gems\logstash-input-jdbc-4.1.3\lib\logstash

Here is an example of my configuration:

input {
jdbc {
jdbc_validate_connection => true
jdbc_connection_string => "jdbc:oracle:thin:@10.0.22.56:1521/DBNAME"
jdbc_user => "QUOD301PRD"
jdbc_password => "password"
jdbc_driver_library => "C:\tmp\logstash-5.2.1\drivers\ojdbc7.jar"
jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
jdbc_paging_enabled => "true"
# jdbc_page_size => "50000"
# schedule => "30 10 * * *"
schedule => "0 6 * * *"
statement => "SELECT * FROM ORDR WHERE ParentOrdID like 'AO%' where id > '13/09/2017 20:50:07'"
use_column_value => true
tracking_column => "id"
tracking_column_type => "numeric"
# clean_run => true
last_run_metadata_path => "/path/.logstash_jdbc_last_run"
}
}

filter {

# Set the timestamp to that of the ASH sample, not current time.

mutate { convert => [ "sample_time" , "string" ]}

date { match => ["sample_time", "ISO8601"]}

}

output {
# stdout { codec => rubydebug }
stdout { codec => json_lines }
elasticsearch {
# hosts => ["localhost:9200"]
index => "ordr"
document_type => "ORDR"
# document_id => "%{id}"
hosts => "localhost"
}
}


(Ibr_B) #10

I don't put the filter it is on comment.

Thank you


(Magnus Bäck) #11

Please copy/paste the exact error message.


(Ibr_B) #12

There the error:

[2017-09-01T14:34:52,536][ERROR][logstash.pipeline ] A plugin had an unrecoverable error. Will restart this plugin.
Plugin: <LogStash::Inputs::Jdbc jdbc_validate_connection=>true, jdbc_connection_string=>"jdbc:oracle:thin:@10.0.22.56:1521/QDSHIVA1", jdbc_user=>"QUOD301PRD", jdbc_password=>, jdbc_driver_library=>"C:\tmp\logstash-5.2.1\drivers\ojdbc7.jar", jdbc_driver_class=>"Java::oracle.jdbc.driver.OracleDriver", jdbc_paging_enabled=>true, statement=>"SELECT * FROM ORDR WHERE id > '13/09/2017 20:50:07'", use_column_value=>true, tracking_column=>"id", tracking_column_type=>"numeric", clean_run=>false, last_run_metadata_path=>"/tmp/ph/.logstash_jdbc_last_run", id=>"4ddd743880bb523c24bcbcd4e0884c0295badf07-1", enable_metric=>true, codec=><LogStash::Codecs::Plain id=>"plain_c10f01b3-cedb-498d-b723-65a6241cc91a", enable_metric=>true, charset=>"UTF-8">, jdbc_page_size=>100000, jdbc_validation_timeout=>3600, jdbc_pool_timeout=>5, sql_log_level=>"info", connection_retry_attempts=>1, connection_retry_attempts_wait_time=>0.5, parameters=>{"sql_last_value"=>0}, record_last_run=>true, lowercase_column_names=>true>
Error: No such file or directory - c:/tmp/ph/.logstash_jdbc_last_run


(Magnus Bäck) #13

That's a completely different problem than what you described earlier. Always show the original error messages.

It's clearly having issues opening c:/tmp/ph/.logstash_jdbc_last_run. Do c:/tmp and c:/tmp/ph exist?


(Ibr_B) #14

The c:\tmp is the repertory contain Logstash, Elasticsearch and kibana.

I think my mistake is about c:/tmp/ph, this one doesn't exist.

I'll try to correct my input confuguration and let you know.

Thank you!


(Ofer Rahat) #15

Have the problem being solved? Otherwise, I'd consider using an old fashion Python/Ruby script on crontab to save the debug time of a cutting edge system...

Ofer


(Magnus Bäck) #16

Have the problem being solved?

As the OP said, C:/tmp/ph didn't exist which would explain why Logstash couldn't open c:/tmp/ph/.logstash_jdbc_last_run.


(Ibr_B) #17

Hi Mag,

Sorry for the late answer, not yet my fisrt question still search how to Load incremental data into Elasticsearch from Oracle database, after the first load ( via Logstash), only the data updated.

As i said, im my oracle database, all my tables have column called histcreationtiontime, this column mention the last update of each record in the table.

I don't know how i can use this column.

Thank you!


(Magnus Bäck) #18

Have you looked at the sql_last_value query parameter and read what is said about that parameter in the documentation?


(system) #19

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.