Here goes my use case:
Table t1 --- >
id a b
123 somestring somestring
Table t2 --->
id c d e
123 someIntegerCount somebooleanValue someString
select * from t1,t2 where t1.id=t2.id and t2.c > 0 and t2.d = 1;
which gives some rows as:
id a b c
d e
123 someString someString someIntegerCount somebooleanValue
someString
Now, In my use case the values of c and d fields in table t2 keeps changing
frequently. So, I index only those rows for which c field > 0 (as count
keeps changing) and d field = 1 (which means either enabled or disabled).
Now, First time indexing is done without any issues. Problem comes when I
update these two fields and want ES to reindex the documents. Since there
might be some documents for which earlier c field was 0 but now it is
non-zero and similarly d field was 0 earlier but changed to 1. Now, I want
ES ti reflect those changes.(I guess that what mongo river does and I
expect mysql to work, automatic sync). Also, there will be few results
which were coming earlier but not now. How to delete those docs from index ?
How can I accomplish this ?
I have tried to explain my problem in simplest manner by keeping things
simple. Please ask questions if anything is not clear.
Thanks
On Thursday, 7 August 2014 14:25:07 UTC+5:30, Jörg Prante wrote:
Can you give a minimal example of a query with the rows, and what rows are
deleted then, so we can work through the issue?
The fundamental problem is that deleted rows in SQL are no longer
available for creating deletion requests and so they can not be tracked
over time - once they are gone, they are gone. The problem is known as
"stale data". This can be solved either at a bigger scope (by using time
windowed indexes where older indexes can be dropped) or by an extra DB
mechanism to provide the IDs of the deleted docs after they are deleted
(maybe by trigger), so they can be selected by JDBC plugin with a "select
_optype, _id" construction. Note, at a certain size, deleting single docs
in ES is not efficient.
To sync data between DB and ES, JDBC plugin is probably not smart enough
(it is impossible to implement app-specific logic in JDBC plugin). So you
should also consider to write a middleware app with specific logic that
controls the deletions in the DB and after that deletes docs in ES.
Jörg
On Thu, Aug 7, 2014 at 10:44 AM, coder <mukul...@gmail.com <javascript:>>
wrote:
Hi,
I'm using jdbc mysql river plugin
GitHub - jprante/elasticsearch-jdbc: JDBC importer for Elasticsearch for creating ES
index. I have been able to index my documents successfully but I'm facing
issues in updating/deleting indexed documents. My jdbc river is used with a
sql query that uses multiple joins on tables and return results. These
results are then indexed in ES. My problem is if I update some tables they
will affect the results of that join query which should be reflected in ES
index but ES index is not updating/deleting sql results from that join
query. I found few threads where people are facing similar issue.
ElasticSearch river JDBC MySQL not deleting records - Stack Overflow
I'm using ES 1.1.0 with jdbc river version 1.1.0.2. There is one more
thread from author itself where he states that deletions are no longer
supported. He tells two methods of tackling the issue. One is re indexing
itself and second is using some sql queries to update/delete indexed
documents.
Parametrized SQL and Aqnowledge SQL error on new version · Issue #202 · jprante/elasticsearch-jdbc · GitHub
Can anyone please tell me how can I tackle the issue. How can I
update/delete already indexed documents. Can anyone please elaborate on the
second method of updating/deleting indexed documents.
Thanks
--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/f1179d7d-4232-4444-b0ee-e7ac383a4bfc%40googlegroups.com
.
For more options, visit https://groups.google.com/d/optout.
--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/42039e84-86e3-4dff-baac-662174a5c1dc%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.