MySQL River?


(Otis Gospodnetić) #1

Hello,

How do people do MySQL (or any other RDBMS) indexing in bulk and
incrementally? Is there something like Solr's DataImportHandler in
ES?
I know there are Rivers and I assume it makes sense to implement a
tool for indexing database content as a River, but I don't see a MySQL
River.

Or are people writing standalone and external indexer apps that push
data from DB to ES? (and thus probably creating a SPOF in their
indexing pipeline)

Thanks,
Otis

Sematext is Hiring World-Wide -- http://sematext.com/about/jobs.html


(Damien Hardy) #2

Hello,

Maybe you can look at using a MySQL trigger to fire a POST request to
Elasticsearch, based on a UDF like: http://code.google.com/p/mysql-udf-http/

Cheers,

--
Damien


(Otis Gospodnetić) #3

Hi,

OK, so this is still external, except it's a push instead of a pull
(like in Solr's DIH).
So no MySQL River.

Is there a technical reason why one could not implement a MySQL
River, or is it just that nobody had this MySQL==>ES itch yet?

Thanks,
Otis


(David Pilato) #4

In my project (with PostgreSQL), I chose to push documents to ES each time I create, update, or delete a Hibernate entity.
That's why I don't use a river for now.

The problem comes when I need to reindex all my documents from my database.
At present, I run a "batch" job for that, but I'm thinking of using Talend instead.

That's not a river, for sure!

David :wink:
@dadoonet
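
A minimal sketch of this push-on-change approach, assuming a local node at `http://localhost:9200` and the `_doc` URL form of current Elasticsearch releases (the releases current when this thread was written used a type segment in that position). The helper names `change_request` and `reindex_request` are hypothetical. The requests are only built, not sent, so the sketch runs without a cluster.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: a local Elasticsearch node


def change_request(op, index, doc_id, doc=None):
    """Build (but do not send) the HTTP request for one entity change."""
    url = f"{ES_URL}/{index}/_doc/{doc_id}"
    if op == "delete":
        return urllib.request.Request(url, method="DELETE")
    # Create and update are both a full re-index of the document.
    body = json.dumps(doc).encode("utf-8")
    return urllib.request.Request(
        url, data=body, method="PUT",
        headers={"Content-Type": "application/json"},
    )


def reindex_request(index, rows):
    """Build one _bulk request that re-indexes every (id, doc) pair."""
    lines = []
    for doc_id, doc in rows:
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(doc))
    # The _bulk body is newline-delimited JSON and must end with a newline.
    body = ("\n".join(lines) + "\n").encode("utf-8")
    return urllib.request.Request(
        f"{ES_URL}/_bulk", data=body, method="POST",
        headers={"Content-Type": "application/x-ndjson"},
    )
```

Passing either request to `urllib.request.urlopen` would send it. The per-entity request covers the create/update/delete hooks, while the bulk request is the shape a full "batch" reindex would likely take.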


(Shay Banon) #5

The problem with a generic database river is handling deletes.


(Otis Gospodnetić) #6

Hello,

On Dec 6, 2:48 pm, Shay Banon kim...@gmail.com wrote:

The problem with a generic database river is handling deletes.

Solr's DIH relies on DB doing soft deletes. Could an ES RDBMS/MySQL
River not do the same thing?

Thanks,
Otis
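
To make the soft-delete idea concrete, here is a rough sketch of an incremental poll. Everything in it is an assumption for illustration: the `articles` table, its `deleted` flag and `updated_at` column, and the `delta_actions` helper. `sqlite3` stands in for MySQL so the example runs anywhere. Rows changed since the last run become `_bulk` action lines, with flagged rows turned into deletes.

```python
import json
import sqlite3


def delta_actions(conn, index, since):
    """Turn rows changed after `since` into _bulk lines (index or delete)."""
    cur = conn.execute(
        "SELECT id, title, deleted, updated_at FROM articles "
        "WHERE updated_at > ? ORDER BY updated_at",
        (since,),
    )
    lines, high_water = [], since
    for doc_id, title, deleted, updated_at in cur:
        if deleted:
            # Soft-deleted row: remove the document from the index.
            lines.append(json.dumps({"delete": {"_index": index, "_id": doc_id}}))
        else:
            lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
            lines.append(json.dumps({"title": title}))
        high_water = max(high_water, updated_at)
    # The caller stores high_water and passes it as `since` on the next run.
    return lines, high_water


# Demo data (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, "
    "deleted INTEGER DEFAULT 0, updated_at INTEGER)"
)
conn.executemany(
    "INSERT INTO articles VALUES (?, ?, ?, ?)",
    [(1, "old", 0, 100), (2, "edited", 0, 205), (3, "gone", 1, 210)],
)
lines, high_water = delta_actions(conn, "articles", since=200)
```

This only works if the application never hard-deletes rows: a row that is physically removed leaves nothing for the poll to find, which is the delete problem raised above.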


(Shay Banon) #7

By soft deletes, you mean marking rows as deleted? Then it's really up to the
application you write to work like that. I'm personally not a fan of using "XML
configuration" to invent a (programming) language for querying a database and
indexing. If one is needed, one can easily write something that does that and
cron it (or write it as a river). Most of the time, the indexing process is more
complex than just indexing a single table into a document; it involves joins
and possibly more criteria that are best expressed (and maintained) in
your favorite lang.
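
As an illustration of the "write it yourself and cron it" route, here is a small sketch of the join step. The `posts` and `post_tags` tables and the `build_documents` helper are hypothetical, and `sqlite3` stands in for the real database so the example is self-contained. It folds the joined rows into one document per post.

```python
import sqlite3


def build_documents(conn):
    """Join posts with their tags and fold each post into one document."""
    cur = conn.execute(
        "SELECT p.id, p.title, t.name FROM posts p "
        "LEFT JOIN post_tags t ON t.post_id = p.id ORDER BY p.id, t.name"
    )
    docs = {}
    for post_id, title, tag in cur:
        doc = docs.setdefault(post_id, {"title": title, "tags": []})
        if tag is not None:  # LEFT JOIN yields NULL for posts without tags
            doc["tags"].append(tag)
    return docs


# Demo data (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE post_tags (post_id INTEGER, name TEXT);
    INSERT INTO posts VALUES (1, 'MySQL River?'), (2, 'Untagged');
    INSERT INTO post_tags VALUES (1, 'mysql'), (1, 'elasticsearch');
    """
)
docs = build_documents(conn)
```

A script like this, scheduled from cron and followed by a bulk index call, keeps the join logic in ordinary code rather than in a configuration format.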

