Structure advice and recommendation


(czar) #1

I am thinking of using Elasticsearch to search over a warehouse table to
improve search performance and, as a nice-to-have end goal, get Google-like
auto-complete.

It is a product database. For efficiency I was thinking of adding a
warehouse table for it, as the search was previously connecting to a lot of
tables and becoming extremely slow.

So I was thinking of having a blob column that would hold all of a product's
attributes, i.e. ingredients, supplier, synonyms and so on: lots of metadata
attributes.

Then if you search by ingredient or supplier, it should return this
product (a "contains" match).

Is this possible with Elasticsearch, and is it a good approach? I am looking
for advice here.

Thanks in advance

--


(David Pilato) #2

As far as I understand your question, it's a common use case in Elasticsearch.

But do you expect Elasticsearch to connect to your database (table) and
perform the search on it?
If not, sorry for what follows.

What you need to do before searching is indexing.
So, you send Elasticsearch some JSON content with your data, let's say
something like:
{
  "ingredients": "",
  "supplier": "",
  "whateveryouwant": ""
}
Elasticsearch will index it.
After that, Elasticsearch will be able to find it regardless of the field. But
Elasticsearch won't connect to your database directly.
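
To make the "index first, then search any field" idea concrete, here is a minimal sketch in Python that only builds the two JSON payloads. The product values, the field list and the helper names are made up for illustration. `multi_match` is a query type in Elasticsearch's query DSL, but check that your version supports it. Note that it matches analysed words rather than arbitrary substrings, so a true "contains" search may need a different analyser.

```python
import json


def build_product_document(name, ingredients, supplier, synonyms):
    """Flatten one product and its attributes into a single JSON-ready dict."""
    return {
        "name": name,
        "ingredients": ingredients,
        "supplier": supplier,
        "synonyms": synonyms,
    }


def build_search_body(text, fields):
    """Build a multi_match query that looks for `text` in every listed field."""
    return {"query": {"multi_match": {"query": text, "fields": list(fields)}}}


# Hypothetical product, used only to show the shape of the payloads.
doc = build_product_document(
    name="Tomato soup",
    ingredients=["tomato", "basil", "salt"],
    supplier="Acme Foods",
    synonyms=["tomato bisque"],
)

index_payload = json.dumps(doc)
search_payload = json.dumps(
    build_search_body("basil", ["ingredients", "supplier", "synonyms"])
)

print(index_payload)   # body you would send when indexing the product
print(search_payload)  # body you would send to the search endpoint
```

Sending these payloads is left out on purpose, since the exact URLs depend on the Elasticsearch version you run.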

Sorry again if I answered outside the scope of your question.

David.

--
David Pilato
http://www.scrutmydocs.org/
http://dev.david.pilato.fr/
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

--


(czar) #3

Thanks for the answer, David. Great advice. No, I do not expect it to connect
to the database table.

I was just wondering about an optimised way of storing all the data. Would it
make sense to have a denormalised warehouse table of the product and all its
attributes? Maybe store all the attributes of the product as a ready-made JSON
string in a blob, so it is easy to create the index from there
and recreate it if needed. Once the search returns the product and ID, I can
look up any related info in the database. This is just to make the search
as quick as possible.
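
A rough sketch of that idea, assuming a hypothetical `product_warehouse` table with one JSON string per product (SQLite is used here only so the example runs on its own). It turns the rows into the newline-delimited action/source pairs that Elasticsearch's bulk API expects. The index name `products` is an assumption, and older Elasticsearch versions also expect a `_type` in the action line.

```python
import json
import sqlite3


def create_warehouse(conn):
    """Create the hypothetical denormalised table: one JSON blob per product."""
    conn.execute(
        "CREATE TABLE product_warehouse ("
        "product_id INTEGER PRIMARY KEY, doc TEXT NOT NULL)"
    )


def bulk_lines(conn, index_name):
    """Yield action/source line pairs in the bulk API's line-oriented format."""
    rows = conn.execute(
        "SELECT product_id, doc FROM product_warehouse ORDER BY product_id"
    )
    for product_id, doc in rows:
        yield json.dumps({"index": {"_index": index_name, "_id": str(product_id)}})
        # Round-trip through json so each source document sits on exactly one line.
        yield json.dumps(json.loads(doc))


conn = sqlite3.connect(":memory:")
create_warehouse(conn)
conn.executemany(
    "INSERT INTO product_warehouse VALUES (?, ?)",
    [
        (1, json.dumps({"name": "Tomato soup", "supplier": "Acme Foods"})),
        (2, json.dumps({"name": "Pesto", "supplier": "Verde"})),
    ],
)

# The bulk body must end with a newline.
bulk_body = "\n".join(bulk_lines(conn, "products")) + "\n"
print(bulk_body)
```

Because the blob is already JSON, rebuilding the whole index is just a matter of replaying this table.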

Thanks again

--


(David Pilato) #4

Ok. So you have two concerns:

  • index fast
  • search fast

Storing JSON blobs in your database could help, as reading entities from a
database with many collections or a deep tree structure can be really slow. On
my side, depending on the entity, I can read anywhere from 5 entities per
second to more than 1.2k entities per second. Elasticsearch indexes them very
fast.

Then, search is fast, really fast. Just give it a try.

What I love about ES is that when you search you get back the IDs but also your
JSON document source directly. You don't have to read your database to display
results.
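
As an illustration of that last point, the sketch below pulls the ID and the stored source out of a search response. The `sample_response` dict is hand-written for this example and trimmed to the relevant keys (`hits.hits[]._id` and `_source`). A real response carries more metadata, and `_source` is only present if source storage has not been disabled.

```python
# Hand-written stand-in for a search response, trimmed to the keys used below.
sample_response = {
    "hits": {
        "hits": [
            {
                "_index": "products",
                "_id": "1",
                "_score": 1.0,
                "_source": {"name": "Tomato soup", "supplier": "Acme Foods"},
            },
        ],
    }
}


def extract_results(response):
    """Return (id, source) pairs from a search response."""
    return [
        (hit["_id"], hit.get("_source", {}))
        for hit in response["hits"]["hits"]
    ]


results = extract_results(sample_response)
for product_id, source in results:
    # Everything needed to render a result row is already in the hit.
    print(product_id, source["name"], source["supplier"])
```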

Not sure I have answered.
David

--


(czar) #5

Thanks again David, much appreciated. I will give it a go and see how it
goes. One other question, if I may: how do you handle updates and inserts?
This is a real-time system, so products will be added, updated, etc. It is not
high volume, but it does occur semi-frequently. Does Elasticsearch support live
updates? I cannot have any downtime or searches going slow if a
re-index or something similar needs to occur.

--


(David Pilato) #6

ES handles updates very well.
You don't have to worry about that.
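
For what it's worth, here is a small sketch of what a live update could look like. Indexing a document under an ID that already exists replaces the previous version, and the change becomes searchable after the next index refresh, so no full re-index is needed for a single product. The host, the index name `products` and the `/_doc/` URL form are assumptions (older versions use a type name in that position). The requests are only built here, not sent.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: a local single-node cluster


def index_request(index_name, product_id, doc):
    """Build a PUT that creates the document or replaces the one with this ID."""
    url = f"{BASE_URL}/{index_name}/_doc/{product_id}"
    return urllib.request.Request(
        url,
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def delete_request(index_name, product_id):
    """Build a DELETE for a product that was removed from the database."""
    url = f"{BASE_URL}/{index_name}/_doc/{product_id}"
    return urllib.request.Request(url, method="DELETE")


# Hypothetical changes: product 1 got a new supplier, product 2 was removed.
update = index_request("products", 1, {"name": "Tomato soup", "supplier": "New Supplier"})
removal = delete_request("products", 2)

print(update.get_method(), update.full_url)
print(removal.get_method(), removal.full_url)
# urllib.request.urlopen(update) would send the request to a running cluster.
```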

--

