JDBC river update issue

Hi Guys,

I'm integrating Elasticsearch with the JDBC river, and I'm puzzled by the update/delete operations.

I'm using the river table, and I'm confused about how to configure the river table column 'source_sql' when 'source_operation' is 'index' or 'delete'.

My river table configuration for 'source_operation' and 'source_sql' looks something like this:

source_operation    source_sql
index               select * from orders where quantity=44
delete              select * from orders where quantity=55

However, with 'index' the data is simply indexed again, and with 'delete' nothing happens to my ES index.

I then tried changing 'source_sql' as shown below, but it throws the SQL exception: Can not issue data manipulation statements with executeQuery().

source_operation    source_sql
index               update orders set quantity=444 where quantity=44
delete              delete * from orders where quantity=55

I also ran into another problem. When the table is large (more than 5 million rows) and I use the JDBC river with rivertable = false, I get an OOM error. It seems the river pulls all the data from the source table before indexing it. Setting 'fetchsize' doesn't help. How do you all handle big tables with the JDBC river?

I hope someone can help.

Thanks a ton,
Spancer

Hi Spancer,

"SQL exception:Can not issue data manipulation statements with
executeQuery()" looks like a bug.
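For context, that message is what the MySQL JDBC driver reports when an UPDATE or DELETE statement is passed to executeQuery(), which is meant for statements that return rows. Below is a minimal sketch of the underlying distinction, written in Python against an in-memory SQLite database rather than the river's own Java code. The helper name run_statement and the sample orders table are made up for illustration, and this is not how the river handles 'source_sql' internally.

```python
import sqlite3


def run_statement(conn, sql):
    """Run one SQL statement and report which kind of result it produced."""
    cur = conn.execute(sql)
    if cur.description is not None:
        # Row-returning statement (e.g. SELECT): hand back the rows.
        return ("rows", cur.fetchall())
    # Data manipulation statement (UPDATE/DELETE/INSERT): report affected rows.
    conn.commit()
    return ("count", cur.rowcount)


# Hypothetical sample data, loosely modelled on the 'orders' table in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, quantity INTEGER)")
conn.executemany("INSERT INTO orders (quantity) VALUES (?)", [(44,), (55,), (55,)])
conn.commit()

select_result = run_statement(conn, "SELECT id, quantity FROM orders WHERE quantity = 44")
update_result = run_statement(conn, "UPDATE orders SET quantity = 444 WHERE quantity = 44")
delete_result = run_statement(conn, "DELETE FROM orders WHERE quantity = 55")
```

A SELECT produces rows that can be turned into documents, while an UPDATE or DELETE only produces an affected-row count, so the two kinds of statement need different execution paths.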

Chunking the row data that can be selected per river cycle is a nice idea.
'fetchsize' is only a hint for the JDBC cursor cache.
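The chunking idea could look roughly like the sketch below, again in Python with an in-memory SQLite database. It is only an illustration of selecting a bounded number of rows per cycle by remembering the last id seen; it is not the river's implementation. The function name iter_chunks, the integer 'id' column, and the sample data are assumptions.

```python
import sqlite3


def iter_chunks(conn, chunk_size):
    """Yield lists of (id, quantity) rows, at most chunk_size rows each."""
    last_id = 0  # assumes ids are positive integers
    while True:
        rows = conn.execute(
            "SELECT id, quantity FROM orders WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, chunk_size),
        ).fetchall()
        if not rows:
            return
        yield rows
        # Remember where this chunk ended so the next query resumes after it.
        last_id = rows[-1][0]


# Hypothetical sample table with ten rows (ids 1..10).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, quantity INTEGER)")
conn.executemany("INSERT INTO orders (quantity) VALUES (?)", [(q,) for q in range(10)])
conn.commit()

chunk_sizes = [len(chunk) for chunk in iter_chunks(conn, chunk_size=4)]
```

Only one chunk is held in memory at a time, so memory use is bounded by the chunk size rather than by the size of the table.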

Can you please post your issues on GitHub, at jprante/elasticsearch-jdbc (JDBC importer for Elasticsearch)?

Thanks,

Jörg

On Tuesday, November 13, 2012 9:00:00 AM UTC+1, spancer ray wrote:

