Logstash JDBC input plugin

I am using the jdbc input plugin to import data from Amazon Redshift into Elasticsearch using Logstash.

I am processing incremental updates for a very big table which adds around 2 million rows every hour and has a timestamp attached to each row.

My problem is that Redshift does not return the rows in sorted order. To process each incremental batch using :sql_last_value, I have to filter the latest 2 million rows and then sort them, which takes a lot of time.

Is there any workaround so that sql_last_value stores the maximum value seen in the current batch, rather than the value from the last row, which requires the input to be sorted on the column assigned to sql_last_value?

My problem is that Redshift does not return the rows in sorted order. To process each incremental batch using :sql_last_value, I have to filter the latest 2 million rows and then sort them, which takes a lot of time.

Can't you let the jdbc input run more often than once an hour so that each batch becomes smaller?

Is there any workaround so that sql_last_value stores the maximum value seen in the current batch, rather than the value from the last row

Sorry, I don't understand the difference.

which requires the input to be sorted on the column assigned to sql_last_value?

If you're only using a timestamp from a column to keep track of what has been processed I don't see how you can possibly avoid sorting the rows before processing them.

For the second part: since the rows returned are not in sorted order, what value does :sql_last_value store for the timestamp column assigned to it? Will it be the timestamp of the last processed row (which might not be the latest timestamp, because Redshift does not return the rows in order), or will it be the maximum of the timestamps processed in the current batch?

It's the timestamp of the last processed row. You need to sort it.

Does Magnus' suggestion of more frequent scheduling not work for you?
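
To make the two suggestions in this thread concrete, here is a rough sketch of a jdbc input that sorts on the tracking column and runs every five minutes instead of hourly. It is only an illustration: the table name (events), the column name (updated_at), the host, the paths, the REDSHIFT_PASSWORD environment variable and the driver class are all assumptions, and the driver class in particular depends on which Redshift JDBC driver version you have installed.

```
input {
  jdbc {
    # Hypothetical driver location and class; adjust to your Redshift JDBC driver version.
    jdbc_driver_library => "/path/to/redshift-jdbc42.jar"
    jdbc_driver_class => "com.amazon.redshift.jdbc42.Driver"
    # Hypothetical cluster endpoint, database and credentials.
    jdbc_connection_string => "jdbc:redshift://example-cluster.example.us-east-1.redshift.amazonaws.com:5439/dev"
    jdbc_user => "logstash"
    jdbc_password => "${REDSHIFT_PASSWORD}"

    # Run every 5 minutes so each batch is much smaller than an hourly one.
    schedule => "*/5 * * * *"

    # Sort ascending on the tracking column so that the last processed row
    # is also the row with the highest timestamp in the batch.
    statement => "SELECT * FROM events WHERE updated_at > :sql_last_value ORDER BY updated_at ASC"

    # Track the column value of the last row rather than the time of the last run.
    use_column_value => true
    tracking_column => "updated_at"
    tracking_column_type => "timestamp"

    # Where the plugin persists sql_last_value between runs (hypothetical path).
    last_run_metadata_path => "/var/lib/logstash/.jdbc_last_run_events"
  }
}
```

With the ORDER BY in place, the value saved from the last row is the same as the maximum of the batch, which is what the original question was after. The sort still has to happen in Redshift, but it only covers the rows added since the previous run.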
