Loading incremental data into Elasticsearch from Oracle database

Newuser · September 27, 2017, 4:14pm

Hi all,

I need to load incremental data into Elasticsearch from an Oracle database. After the first load (via Logstash), I need to load only the data that has been updated.

Could you advise on this please?

Regards,

magnusbaeck · September 27, 2017, 8:05pm

Does the data have a last modified column that you can use to select only the rows that have changed since a certain point in time?

Newuser · September 28, 2017, 8:34am

Yes, we have that column: "HistCreationTime". How can I use this column?

Thank you!

magnusbaeck · September 28, 2017, 10:18am

Have you read the State and Predefined parameters section of the jdbc input documentation?

Newuser · September 28, 2017, 11:59am

I have read about sql_last_value and the schedule parameter.
The schedule works fine, but I don't know how to use sql_last_value! I already use jdbc_validate_connection, jdbc_user, jdbc_password and the driver settings in the Logstash input.
But I can't see last_run_metadata_path, tracking_column_type, tracking_column, etc.
Where can I find them? Maybe there is another jdbc plugin I have to install?

Thank you,

magnusbaeck · September 28, 2017, 12:05pm

sql_last_value is the name of a query parameter that you can reference in your SQL query. Logstash will populate that parameter with either the time of the previous run or the last tracking column value it processed, so you'd typically use it like in this example: Jdbc input plugin | Logstash Reference [8.11] | Elastic
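
As a minimal sketch of that usage, assuming the HistCreationTime column mentioned above is an Oracle DATE or TIMESTAMP: the table name MY_TABLE, the connection values and the driver path are placeholders rather than anything from your setup, and the ORDER BY is my own addition.

```
input {
  jdbc {
    # Placeholder connection settings; replace with your own.
    jdbc_connection_string => "jdbc:oracle:thin:@dbhost:1521/SERVICE"
    jdbc_user => "myuser"
    jdbc_password => "mypassword"
    jdbc_driver_library => "/path/to/ojdbc7.jar"
    jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"

    # Run once a day at 06:00.
    schedule => "0 6 * * *"

    # :sql_last_value is replaced with the value saved by the previous run.
    # MY_TABLE is a hypothetical table name.
    statement => "SELECT * FROM MY_TABLE WHERE HistCreationTime > :sql_last_value ORDER BY HistCreationTime"

    # Track the column value instead of the time of the last run.
    use_column_value => true
    # Lowercase, because lowercase_column_names is true by default.
    tracking_column => "histcreationtime"
    tracking_column_type => "timestamp"
  }
}
```

With use_column_value enabled, the plugin saves the histcreationtime of the last row it read and substitutes it for :sql_last_value on the next scheduled run, which is why ordering by that column seems advisable.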

But I can't see last_run_metadata_path, tracking_column_type, tracking_column, etc.
Where can I find them?

Find what? Their documentation?

Newuser · September 28, 2017, 12:09pm

Find those parameters. For example, when I run Logstash I get an error saying that "last_run_metadata_path" doesn't exist.

magnusbaeck · September 28, 2017, 1:09pm

What does your configuration look like? Please copy/paste the exact error message. Also, what version of the logstash-input-jdbc plugin do you have (run logstash-plugin list --version to find out)?

Newuser · September 29, 2017, 9:51am

The version of Logstash is 5.2.1.
The version of logstash-input-jdbc is 4.1.3.
Below is the path to my logstash-input-jdbc:
C:\tmp\logstash-5.2.1\vendor\bundle\jruby\1.9\gems\logstash-input-jdbc-4.1.3\lib\logstash

Here is an example of my configuration:

input {
  jdbc {
    jdbc_validate_connection => true
    jdbc_connection_string => "jdbc:oracle:thin:@10.0.22.56:1521/DBNAME"
    jdbc_user => "QUOD301PRD"
    jdbc_password => "password"
    jdbc_driver_library => "C:\tmp\logstash-5.2.1\drivers\ojdbc7.jar"
    jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
    jdbc_paging_enabled => "true"
    # jdbc_page_size => "50000"
    # schedule => "30 10 * * *"
    schedule => "0 6 * * *"
    statement => "SELECT * FROM ORDR WHERE ParentOrdID like 'AO%' where id > '13/09/2017 20:50:07'"
    use_column_value => true
    tracking_column => "id"
    tracking_column_type => "numeric"
    # clean_run => true
    last_run_metadata_path => "/path/.logstash_jdbc_last_run"
  }
}

filter {
  # Set the timestamp to that of the ASH sample, not current time.
  mutate { convert => [ "sample_time" , "string" ]}
  date { match => ["sample_time", "ISO8601"]}
}

output {
  # stdout { codec => rubydebug }
  stdout { codec => json_lines }
  elasticsearch {
    # hosts => ["localhost:9200"]
    index => "ordr"
    document_type => "ORDR"
    # document_id => "%{id}"
    hosts => "localhost"
  }
}

Newuser · September 29, 2017, 9:52am

I don't use the filter; it is commented out.

Thank you

magnusbaeck · September 29, 2017, 10:02am

Please copy/paste the exact error message.

Newuser · September 29, 2017, 10:15am

Here is the error:

[2017-09-01T14:34:52,536][ERROR][logstash.pipeline ] A plugin had an unrecoverable error. Will restart this plugin.
Plugin: <LogStash::Inputs::Jdbc jdbc_validate_connection=>true, jdbc_connection_string=>"jdbc:oracle:thin:@10.0.22.56:1521/QDSHIVA1", jdbc_user=>"QUOD301PRD", jdbc_password=>, jdbc_driver_library=>"C:\tmp\logstash-5.2.1\drivers\ojdbc7.jar", jdbc_driver_class=>"Java::oracle.jdbc.driver.OracleDriver", jdbc_paging_enabled=>true, statement=>"SELECT * FROM ORDR WHERE id > '13/09/2017 20:50:07'", use_column_value=>true, tracking_column=>"id", tracking_column_type=>"numeric", clean_run=>false, last_run_metadata_path=>"/tmp/ph/.logstash_jdbc_last_run", id=>"4ddd743880bb523c24bcbcd4e0884c0295badf07-1", enable_metric=>true, codec=><LogStash::Codecs::Plain id=>"plain_c10f01b3-cedb-498d-b723-65a6241cc91a", enable_metric=>true, charset=>"UTF-8">, jdbc_page_size=>100000, jdbc_validation_timeout=>3600, jdbc_pool_timeout=>5, sql_log_level=>"info", connection_retry_attempts=>1, connection_retry_attempts_wait_time=>0.5, parameters=>{"sql_last_value"=>0}, record_last_run=>true, lowercase_column_names=>true>
Error: No such file or directory - c:/tmp/ph/.logstash_jdbc_last_run

magnusbaeck · September 29, 2017, 11:19am

That's a completely different problem than what you described earlier. Always show the original error messages.

It's clearly having issues opening c:/tmp/ph/.logstash_jdbc_last_run. Do c:/tmp and c:/tmp/ph exist?

Newuser · September 29, 2017, 11:35am

c:\tmp is the directory that contains Logstash, Elasticsearch and Kibana.

I think my mistake is with c:/tmp/ph; that one doesn't exist.

I'll try to correct my input configuration and let you know.

Thank you!

leofer · October 7, 2017, 9:30pm

Has the problem been solved? Otherwise, I'd consider using an old-fashioned Python/Ruby script run from crontab to save the time spent debugging a cutting-edge system...

Ofer

magnusbaeck · October 8, 2017, 9:10am

Has the problem been solved?

As the OP said, C:/tmp/ph didn't exist which would explain why Logstash couldn't open c:/tmp/ph/.logstash_jdbc_last_run.
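
A minimal sketch of the fix, assuming the Logstash process may write to C:/tmp/logstash-5.2.1 (the install directory from the earlier post, so it is known to exist). Judging by the error, the plugin does not create a missing parent directory itself, so the path has to point into a directory that is already there. Only the relevant option is shown.

```
input {
  jdbc {
    # Connection, statement and tracking settings as in the configuration posted earlier.

    # The parent directory must already exist; the file itself is written by the plugin.
    last_run_metadata_path => "C:/tmp/logstash-5.2.1/.logstash_jdbc_last_run"
  }
}
```

Alternatively, creating c:/tmp/ph by hand and keeping the original path should have the same effect.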

Newuser · October 16, 2017, 11:51am

Hi Mag,

Sorry for the late answer. Not yet; my first question still stands: how to load incremental data into Elasticsearch from an Oracle database so that, after the first load (via Logstash), only the updated data is loaded.

As I said, in my Oracle database all my tables have a column called HistCreationTime, which records the last update of each record in the table.

I don't know how I can use this column.

Thank you!

magnusbaeck · October 16, 2017, 11:53am

Have you looked at the sql_last_value query parameter and read what is said about that parameter in the documentation?
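
One related point, sketched under the assumption that the id column uniquely identifies a row in ORDR: when an updated row is fetched again, setting document_id should make Elasticsearch overwrite the existing document rather than index a second copy. The index name and host come from the configuration posted earlier; the port is assumed.

```
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "ordr"
    # Assumes "id" is unique per row; the field name is lowercase
    # because the jdbc input lowercases column names by default.
    document_id => "%{id}"
  }
}
```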

system · November 13, 2017, 11:54am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
