Hello everyone,
I'm using the Logstash JDBC input plugin to move my data from PostgreSQL to Elasticsearch.
Because my SQL query returns too many rows (about 1 million records), I use a schedule that fetches about 5 rows per run and repeats every second.
I use tracking_column to track the ID column.
The problem is that sometimes two consecutive rows have an ID gap bigger than 100, and Logstash gets stuck at that point, along with :sql_last_value.
For example, two consecutive rows have decision_id 14 and 120. Logstash doesn't move past decision_id = 14 and :sql_last_value stays at 14, because there is no row with 14 < decision_id < 19. Is there any way to update :sql_last_value, or maybe another approach?
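My input looks roughly like this (simplified; the table name and connection details are placeholders):

```
input {
  jdbc {
    jdbc_connection_string => "jdbc:postgresql://localhost:5432/mydb"  # placeholder
    jdbc_user => "postgres"                                            # placeholder
    jdbc_driver_class => "org.postgresql.Driver"
    schedule => "* * * * * *"          # six fields: run every second
    use_column_value => true
    tracking_column => "decision_id"
    tracking_column_type => "numeric"
    # The statement only looks inside a fixed window of 5 IDs above
    # :sql_last_value. When no row falls in that window (e.g. the jump
    # from 14 to 120), nothing is returned and :sql_last_value never advances.
    statement => "SELECT * FROM decisions WHERE decision_id > :sql_last_value AND decision_id < :sql_last_value + 5"
  }
}
```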
Thanks a lot.
pilo
What happens if you run your SQL without the WHERE clause?
Is it going to overload the database?
Elasticsearch should be able to handle many millions of records in one go.
A few other things I see from your config:
it is going to execute this query every second
it is going to create duplicate records if you run it again from another system, or after removing the last_run_metadata file
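If you do need to re-run it, one way to avoid duplicates (a sketch, assuming decision_id is unique per row) is to reuse it as the document ID in the output:

```
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]   # placeholder host
    index => "decisions"                 # placeholder index name
    # The same decision_id always maps to the same _id, so a re-run
    # overwrites the existing document instead of creating a duplicate.
    document_id => "%{decision_id}"
  }
}
```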
Hello, thanks for your reply.
When I run the query without the WHERE clause, I get a Java heap space OutOfMemoryError.
Do you have another approach for this error?
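Would the plugin's built-in paging be an option here? A sketch of what I mean (page and fetch sizes are guesses):

```
input {
  jdbc {
    jdbc_connection_string => "jdbc:postgresql://localhost:5432/mydb"  # placeholder
    jdbc_user => "postgres"                                            # placeholder
    jdbc_driver_class => "org.postgresql.Driver"
    jdbc_paging_enabled => true   # the plugin wraps the statement in LIMIT/OFFSET pages
    jdbc_page_size => 50000       # rows fetched per page
    jdbc_fetch_size => 1000       # hint to the JDBC driver to fetch rows in batches
    statement => "SELECT * FROM decisions ORDER BY decision_id ASC"
  }
}
```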