Design for ecommerce website

Hi,

I have a question about designing/using ElasticSearch for an ecommerce website.

Imagine we have to sell shoes. We have several models, and for each model,
different sizes and colors : '{"model":"converse all star",
"title":"Converse All Star", "img":"/img/converseallstar-red-8.jpg",
"size":"8", "color":"red","price":39}'

If I search for converse models, I want only one shoe per model in my
results (with, for example, the min price, the title and the img).

In the SQL world, to search all converse models, I would do "SELECT model,
min(price) FROM table WHERE model like '%converse%' GROUP BY model"
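
To make the intended result concrete, here is a minimal sketch of that GROUP BY run against an in-memory SQLite table. The table name `shoes`, the reduced column set and the use of Python's sqlite3 module are assumptions for illustration only; the rows are a subset of the test dataset below.

```python
import sqlite3

# Hypothetical in-memory table mirroring a few of the shoe documents.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shoes (model TEXT, size TEXT, color TEXT, price INTEGER)")
conn.executemany(
    "INSERT INTO shoes VALUES (?, ?, ?, ?)",
    [
        ("converse all star", "8", "red", 39),
        ("converse all star", "9", "blue", 39),
        ("converse hi", "8", "red", 49),
        ("converse hi", "9", "blue", 49),
    ],
)

# One row per model, carrying the lowest price found for that model.
rows = conn.execute(
    "SELECT model, MIN(price) FROM shoes "
    "WHERE model LIKE '%converse%' GROUP BY model ORDER BY model"
).fetchall()
print(rows)
```

Four variant rows collapse into two rows, one per model, which is the shape of result wanted from the search engine.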

Unfortunately, it seems field collapsing is not implemented. I tried
implementing this in Solr, which supports field collapsing, and it works
great, but I'd prefer to use ElasticSearch.

One solution would be to first get the distinct models and then make one
query for each model, but I think it would be slow.

Or maybe nested documents or parent/child is the solution, but I don't
understand how to implement them.

Here is my test dataset:
curl -X PUT http://localhost:9200/article/shoe/converseallstar-red-8 -d \
'{"model":"converse all star", "title":"Converse All Star",
"img":"/img/converseallstar-red-8.jpg", "size":"8",
"color":"red","price":39}'
curl -X PUT http://localhost:9200/article/shoe/converseallstar-red-9 -d \
'{"model":"converse all star", "title":"Converse All Star",
"img":"/img/converseallstar-red-9.jpg", "size":"9",
"color":"red","price":39}'
curl -X PUT http://localhost:9200/article/shoe/converseallstar-blue-8 -d \
'{"model":"converse all star", "title":"Converse All Star",
"img":"/img/converseallstar-blue-8.jpg", "size":"8",
"color":"blue","price":39}'
curl -X PUT http://localhost:9200/article/shoe/converseallstar-blue-9 -d \
'{"model":"converse all star", "title":"Converse All Star",
"img":"/img/converseallstar-blue-9.jpg", "size":"9",
"color":"blue","price":39}'
curl -X PUT http://localhost:9200/article/shoe/conversehi-red-8 -d \
'{"model":"converse hi", "title":"Converse Hi",
"img":"/img/conversehi-red-8.jpg", "size":"8", "color":"red","price":49}'
curl -X PUT http://localhost:9200/article/shoe/conversehi-red-9 -d \
'{"model":"converse hi", "title":"Converse Hi",
"img":"/img/conversehi-red-9.jpg", "size":"9", "color":"red","price":49}'
curl -X PUT http://localhost:9200/article/shoe/conversehi-blue-8 -d \
'{"model":"converse hi", "title":"Converse Hi",
"img":"/img/conversehi-blue-8.jpg", "size":"8", "color":"blue","price":49}'
curl -X PUT http://localhost:9200/article/shoe/conversehi-blue-9 -d \
'{"model":"converse hi", "title":"Converse Hi",
"img":"/img/conversehi-blue-9.jpg", "size":"9", "color":"blue","price":49}'

With this dataset, here is the result I expect:

  • "converse" => [{"model":"converse all star", "title":"Converse All
    Star", "img":"/img/converseallstar-...whatever...jpg"}, {"model":"converse
    hi", "title":"Converse Hi", "img":"/img/conversehi...whatever...jpg"}]
  • "red converse" => [{"model":"converse all star", "title":"Converse All
    Star",
    "img":"/img/converseallstar-red-...whatever...jpg"}, {"model":"converse
    hi", "title":"Converse Hi", "img":"/img/conversehi-red-...whatever...jpg"}]

What are my possibilities?

Mickael

--

If there is only a limited set of models, then you can indeed issue a
separate query for each, using the multi-search API
[http://www.elasticsearch.org/guide/reference/api/multi-search.html]. You
can also use parent/child and the top_children query
[http://www.elasticsearch.org/guide/reference/query-dsl/top-children-query.html].
Alternatively, you can retrieve a larger result set and aggregate the
results into models on the client.
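
As a rough sketch of the client-side option, the function below keeps the cheapest hit per model from a list of search hits. It assumes each hit carries the indexed document under `_source`, as in a search response; the function name `collapse_by_model` and the sample hits are hypothetical, and fetching the hits from the cluster is left out.

```python
def collapse_by_model(hits):
    """Keep the cheapest hit per model, in the order models first appear."""
    best = {}
    for hit in hits:
        doc = hit["_source"]
        current = best.get(doc["model"])
        # Replace the stored document only when a strictly cheaper one shows up.
        if current is None or doc["price"] < current["price"]:
            best[doc["model"]] = doc
    return [
        {"model": d["model"], "title": d["title"], "img": d["img"], "price": d["price"]}
        for d in best.values()
    ]


# Hypothetical hits, shaped like entries of hits.hits in a search response.
hits = [
    {"_id": "converseallstar-red-8", "_source": {
        "model": "converse all star", "title": "Converse All Star",
        "img": "/img/converseallstar-red-8.jpg", "size": "8", "color": "red", "price": 39}},
    {"_id": "converseallstar-red-9", "_source": {
        "model": "converse all star", "title": "Converse All Star",
        "img": "/img/converseallstar-red-9.jpg", "size": "9", "color": "red", "price": 39}},
    {"_id": "conversehi-red-8", "_source": {
        "model": "converse hi", "title": "Converse Hi",
        "img": "/img/conversehi-red-8.jpg", "size": "8", "color": "red", "price": 49}},
]

collapsed = collapse_by_model(hits)
for item in collapsed:
    print(item["model"], item["price"], item["img"])
```

The catch with this approach is that the requested result set has to be large enough to contain at least one hit for every model you want to show, so paging by model becomes the client's job.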

You can either sort on the price, or use it as a weighting factor for
relevancy with the custom_score query
[http://www.elasticsearch.org/guide/reference/query-dsl/custom-score-query.html].
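
The two variants could look roughly like the request bodies built below. This is only a sketch: the `match` query on `model` and the script expression are assumptions chosen for illustration, and the bodies are printed rather than sent, so check the linked pages for the options your version supports.

```python
import json

# Assumed base query: full-text match on the model field.
base_query = {"match": {"model": "converse"}}

# Variant 1: plain sort, cheapest documents first.
sort_body = {
    "query": base_query,
    "sort": [{"price": "asc"}],
}

# Variant 2: custom_score wraps the base query and rescales its relevancy.
# The script below (score divided by price) is a hypothetical weighting.
custom_score_body = {
    "query": {
        "custom_score": {
            "query": base_query,
            "script": "_score / doc['price'].value",
        }
    }
}

print(json.dumps(sort_body, indent=2))
print(json.dumps(custom_score_body, indent=2))
```

Sorting ignores relevancy entirely, while the script keeps it and only favours cheaper items, so which one fits depends on whether the text match or the price should dominate the ordering.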

Karel

On Thursday, November 8, 2012 4:23:23 PM UTC+1, Mickael Magniez wrote:

--

Thanks for your response.

No, there are a lot of models (in fact the website has many more products
than shoes, but I used a small dataset for the example).

I'll take a look at parent/child.

--