Finding the right balance between SQL and ES


(Jondow) #1

We currently have a lot of data sitting in numerous tables that
describe the entity that is queried, or paged (like a catalog would
be). The querying part is however very brute force and I felt it could
benefit from the use of a search engine like ES.

The one decision I'm trying to make however is whether to further
extrapolate the existing SQL tables and 'massage' the somewhat raw
format of the data into other tables that would ease the burden on SQL
queries, or whether ES would be completely up to the task itself, of
being used to search, and also to simply page through a list of
entities based on various filtering criteria (so somewhat like a
search... essentially a filter is like a 'saved search' I guess).

But I'm not sure whether to introduce the intermediate SQL tables as
well as to use ES, or whether to simply just use ES and the SQL tables
are there as backup (if you will) of the data, and to go directly to
certain data once ES has produced its results.

I'm not sure if I've explained myself well, but if I have, does
anybody have any insights/advice on this?

Thanks in advance,
Darryl Pentz


(Shay Banon) #2

You explained it well, but the question is quite broad. You will need to do some testing and see if you can do what you want with elasticsearch.
On Wednesday, February 9, 2011 at 4:29 PM, Jondow wrote:

We currently have a lot of data sitting in numerous tables that
describe the entity that is queried, or paged (like a catalog would
be). The querying part is however very brute force and I felt it could
benefit from the use of a search engine like ES.

The one decision I'm trying to make however is whether to further
extrapolate the existing SQL tables and 'massage' the somewhat raw
format of the data into other tables that would ease the burden on SQL
queries, or whether ES would be completely up to the task itself, of
being used to search, and also to simply page through a list of
entities based on various filtering criteria (so somewhat like a
search... essentially a filter is like a 'saved search' I guess).

But I'm not sure whether to introduce the intermediate SQL tables as
well as to use ES, or whether to simply just use ES and the SQL tables
are there as backup (if you will) of the data, and to go directly to
certain data once ES has produced its results.

I'm not sure if I've explained myself well, but if I have, does
anybody have any insights/advice on this?

Thanks in advance,
Darryl Pentz


(Stephane Bastian) #3

Hi Darryl,

We had the same question a while back when indexing a bunch of data that
where inside a sql db.
We basically ended up using elastic search (in fact it was solr at that
time) for pretty much all queries. We store the table/id field in the
ES index so that we get the actual data from the SQL store as usual. The
only difference is that we use it as some sort of key/value store where
the key is the row id returned by ES. This works pretty well for us

Hope this helps.

Stephane Bastian
http://www.presdici.fr

On Wed, 2011-02-09 at 21:27 +0200, Shay Banon wrote:

You explained it well, but the question is quite broad. You will need
to do some testing and see if you can do what you want with
elasticsearch.

On Wednesday, February 9, 2011 at 4:29 PM, Jondow wrote:

We currently have a lot of data sitting in numerous tables that
describe the entity that is queried, or paged (like a catalog would
be). The querying part is however very brute force and I felt it
could
benefit from the use of a search engine like ES.

The one decision I'm trying to make however is whether to further
extrapolate the existing SQL tables and 'massage' the somewhat raw
format of the data into other tables that would ease the burden on
SQL
queries, or whether ES would be completely up to the task itself, of
being used to search, and also to simply page through a list of
entities based on various filtering criteria (so somewhat like a
search... essentially a filter is like a 'saved search' I guess).

But I'm not sure whether to introduce the intermediate SQL tables as
well as to use ES, or whether to simply just use ES and the SQL
tables
are there as backup (if you will) of the data, and to go directly to
certain data once ES has produced its results.

I'm not sure if I've explained myself well, but if I have, does
anybody have any insights/advice on this?

Thanks in advance,
Darryl Pentz


(system) #4