Architecture for Elastic Search platform


(Kenny Meyer) #1

I am working on populating a search index powered by Elasticsearch from a
database hosted on Team A's server, but I am unsure whether Team A can
periodically push data from their servers to our search index as JSON.

To give you more context:

The database itself runs on PostgreSQL and is owned by Team A. The search
index is hosted on our servers, and I have full ownership over it.

While searching on Google I found "Rivers"
(http://www.elasticsearch.org/guide/reference/river/), but none of them
seems to support PostgreSQL databases. Why?

Looking forward to your insights on the topic, and possibly some feedback. :)

Kenny %-)


(Shay Banon) #2

You will need to write code that gets the data from the database and indexes
it into Elasticsearch. You might want to move it to a river later on, but
that's not relevant now.
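
As a rough illustration of that approach (a sketch under stated assumptions, not code from this thread): rows read from the database can be turned into the newline-delimited JSON body that the Elasticsearch bulk API accepts. The index name `articles`, the row fields, and the helper name `rows_to_bulk_body` are hypothetical. Fetching the rows from PostgreSQL and POSTing the body to the `_bulk` endpoint are left out so the example runs with the standard library only. Older Elasticsearch versions also expect a `_type` in the action line.

```python
import json


def rows_to_bulk_body(rows, index_name, id_field="id"):
    """Build a newline-delimited JSON body for the Elasticsearch bulk API."""
    lines = []
    for row in rows:
        # Action line: names the target index and the document ID.
        action = {"index": {"_index": index_name, "_id": str(row[id_field])}}
        lines.append(json.dumps(action))
        # Source line: the document itself, here a 1-to-1 copy of the row.
        # default=str keeps values such as dates serializable.
        lines.append(json.dumps(row, default=str))
    # The bulk API requires the body to end with a newline.
    return ("\n".join(lines) + "\n") if lines else ""


# Hypothetical rows, standing in for the result of a SELECT on the database.
sample_rows = [
    {"id": 1, "title": "First post", "body": "Hello"},
    {"id": 2, "title": "Second post", "body": "World"},
]

bulk_body = rows_to_bulk_body(sample_rows, "articles")
print(bulk_body, end="")
```

Running a script like this from a scheduler on either team's side would cover the "periodically" part; which side runs it depends on who is allowed to reach which server.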


(David Pilato) #3

You could also look at the Scrutineer project:
https://github.com/Aconex/scrutineer
It might help with that need.

David.

--
David Pilato
http://dev.david.pilato.fr/
Twitter : @dadoonet


(Kenny Meyer) #4

I tried to use it, but I can't get the database connection working.


(Kenny Meyer) #5

I wrote a small and super-simple PHP indexer, and I am open to other,
existing approaches.

https://gist.github.com/2300615

I find it strange that there's not a lot of code out there for indexing
data from an RDBMS. Could it be that Elasticsearch wasn't made for needs
like ours, or is there another explanation?


(Greg Brown) #6

Your PHP code looks like it would work fine as long as you continue to
do a 1 to 1 mapping of RDBMS fields to ES fields. However, I'd highly
recommend looking at the Elastica client library if you are building
any significant amount of PHP indexing/searching: https://github.com/ruflin/Elastica

Great library. Makes the code much more readable/maintainable.

We pull all of our data from an RDBMS, but generally do a fair bit of
massaging of the data before putting it into ES. I would expect that
is true of many users, and that is the reason why a generic way of
pulling from an RDBMS doesn't exist. In particular we end up adding
new fields to search on, stripping HTML in some cases, and turning
constants into names. The way a user wants to search is rarely the
way that you store data in an RDBMS, so a 1 to 1 mapping won't always
work.
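
To make that kind of massaging concrete, here is a small sketch. It is only an illustration under assumptions: the row fields, the `STATUS_NAMES` table, and the helper names `strip_html` and `row_to_document` are all hypothetical and not taken from any code in this thread. It shows the three steps mentioned above: stripping HTML, turning a constant into a name, and adding a field that exists only in the search index.

```python
from html.parser import HTMLParser

# Hypothetical mapping from a numeric status constant to a searchable name.
STATUS_NAMES = {0: "draft", 1: "published", 2: "archived"}


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(fragment):
    """Return the text of an HTML fragment with tags removed."""
    parser = _TextExtractor()
    parser.feed(fragment)
    parser.close()
    # Collapse runs of whitespace left behind by the removed tags.
    return " ".join("".join(parser.parts).split())


def row_to_document(row):
    """Turn one database row into the document that gets indexed."""
    body_text = strip_html(row["body_html"])
    return {
        "title": row["title"],
        "body": body_text,
        # Constant turned into a name; unknown constants get a fallback.
        "status": STATUS_NAMES.get(row["status"], "unknown"),
        # New field that exists only in the search index.
        "title_and_body": f'{row["title"]} {body_text}',
    }


# Hypothetical row, standing in for one result of a SELECT.
row = {
    "id": 7,
    "title": "Release notes",
    "body_html": "<p>Fixed <b>two</b> bugs.</p>",
    "status": 1,
}
doc = row_to_document(row)
print(doc)
```

Because every project needs a different `row_to_document`, this step is hard to package generically, which fits the point about why no generic RDBMS puller exists.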

-Greg

On Apr 4, 6:03 am, Kenny Meyer knny.m...@gmail.com wrote:

I wrote a small and super-simple PHP indexer, and I am open to other,
existing approaches.

https://gist.github.com/2300615


(Kenny Meyer) #7

Greg, thanks for the advice. I somehow missed your reply entirely.

Do you normalize the data from the RDBMS inside of Elastica?

