How do people do MySQL (or any other RDBMS) indexing in bulk and
incrementally? Is there something like Solr's DataImportHandler in
ES?
I know there are Rivers, and I assume it makes sense to implement a
tool for indexing database content as a River, but I don't see a MySQL
River.
Or are people writing standalone, external indexer apps that push
data from the DB to ES (and thus probably create a SPOF in their
indexing pipeline)?
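There is no built-in DataImportHandler equivalent; the external-indexer pattern the question describes usually amounts to selecting changed rows and POSTing them to ES's `_bulk` endpoint. A minimal sketch of building the `_bulk` request body — the index/type names and row shape here are illustrative assumptions, not anything from this thread:

```python
import json

def rows_to_bulk_body(rows, index="products", doc_type="product"):
    """Build an NDJSON _bulk request body from DB rows.

    Each row is a dict that must contain an 'id' key; everything else
    becomes the document source. 'products'/'product' are placeholder
    names for this sketch.
    """
    lines = []
    for row in rows:
        source = {k: v for k, v in row.items() if k != "id"}
        # _bulk wire format: one action line, then one document line
        lines.append(json.dumps(
            {"index": {"_index": index, "_type": doc_type, "_id": str(row["id"])}}))
        lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"  # _bulk requires a trailing newline

# An external indexer would SELECT rows WHERE updated_at > last_run,
# then POST this body to http://localhost:9200/_bulk.
body = rows_to_bulk_body([{"id": 1, "name": "widget"}])
```

Such a tool can be cron'd; making it stateless (the watermark lives in the DB) mitigates the SPOF concern, since any host can run the next cycle.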
In my project (with PostgreSQL), I chose to push documents to ES each time I create, update, or delete a Hibernate entity.
That's why I don't use a river for now.
The problem is when I need to reindex all my documents from my database.
At present I run a "batch" for that, but I'm thinking of using Talend instead.
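David's approach — mirror every entity create/update/delete into ES at write time — is in his case a Hibernate-side hook; here is a language-agnostic sketch of the same pattern, with a hypothetical repository wrapper and a fake ES client standing in for the real persistence layer and search client (all names are illustrative):

```python
class FakeEs:
    """Test double standing in for a real Elasticsearch client."""
    def __init__(self):
        self.docs = {}
    def index(self, index, _id, doc):
        self.docs[(index, _id)] = doc
    def delete(self, index, _id):
        self.docs.pop((index, _id), None)

class SearchMirroringRepository:
    """Wraps a DB store and mirrors every write to a search index.

    `db` and `es` are stand-ins for the real persistence layer and
    ES client; in David's setup the equivalent hook would live in a
    Hibernate entity listener.
    """
    def __init__(self, db, es, index="entities"):
        self.db, self.es, self.index = db, es, index

    def save(self, entity_id, doc):
        self.db[entity_id] = doc                    # persist the entity
        self.es.index(self.index, entity_id, doc)   # push the same doc to ES

    def delete(self, entity_id):
        self.db.pop(entity_id, None)
        self.es.delete(self.index, entity_id)       # keep the index in sync
```

The trade-off David notes follows directly: the index is always fresh, but a full reindex needs a separate batch path, since no single write event ever touches every document.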
The problem with a generic database river is handling deletes.
On Tue, Dec 6, 2011 at 8:49 PM, David Pilato &lt;david@pilato.fr&gt; wrote:
By soft deletes you mean marking rows as deleted? Then it's really up to the
application you write to work like that. Personally, I'm not a fan of "XML
configuration" that invents a (programming) language to query the database
and index. If one is needed, one can write something that does that easily
and cron it (or write it as a river). Most times the indexing process is
more complex than just indexing a single table into a document; it involves
joins and possibly more criteria that are best expressed (and maintained)
in your favorite language.
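The soft-delete idea connects directly to the earlier point that a generic river can't see deletes: a hard-deleted row simply never comes back from a SELECT, while a soft-deleted row leaves a tombstone the indexer can turn into an ES delete. A sketch of one incremental sync cycle under that assumption — the column names `updated_at` and `is_deleted` are illustrative, not from the thread:

```python
def sync_actions(rows, last_sync):
    """Turn rows changed since `last_sync` into ES bulk actions.

    Assumes each row carries a comparable `updated_at` timestamp and a
    soft-delete flag `is_deleted`. Rows the application hard-deletes
    would never appear here, which is exactly why a generic database
    river struggles with deletes.
    """
    actions = []
    for row in rows:
        if row["updated_at"] <= last_sync:
            continue  # unchanged since the last run
        if row["is_deleted"]:
            actions.append(("delete", row["id"]))  # tombstone -> remove from ES
        else:
            actions.append(("index", row["id"]))   # new or updated -> (re)index
    return actions
```

The action list would then be rendered into a `_bulk` body; the joins and extra criteria Shay mentions would live in the SELECT that produces `rows`, in whatever language the team maintains best.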