Boosting documents based on the query provided


(gkwelding) #1

This is a bit of an odd one, although not really for those of us who work
in the eCommerce world I suppose.

I work for one of the largest children's retailers in the UK. We're
currently knocking together a demonstration of our eCommerce platform with
everything running off Elasticsearch rather than the 3rd party solution we
currently use for our search. The 3rd party solution costs us thousands of
pounds every month, whereas the architecture needed to run our entire site
(search, range pages and product details pages) off ES costs a fraction of
that.

I currently have most of the functionality working via ES just fine.
However, our content co-ordinators really require the ability (via a CMS of
our design) to boost certain VIP products, but based on the query provided.
They know the document ID, and they know the query string they want to
boost products for. I can write the CMS with a connector to get this
information into ES but I'm unsure of what to do next.

Two options I have considered are as follows:

  1. A separate ES index containing the document id (this is the product
    id), the query string to boost for, and the boost factor.
  2. A property on the document itself that contains the query string to
    boost for and the boost factor.

These boost factors would then be applied using the function_score
mechanism. Which one would you choose? Why? Is there an option I'm missing?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/a3b951a6-5d90-45b0-8223-494032d830d9%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(gkwelding) #2

Nobody got any insights on this? Even a vague suggestion or guess would be
much appreciated.

(In reply to Garry Welding's post of Tuesday, March 4, 2014 11:19:09 AM UTC.)


(Binh Ly-2) #3

I'd probably do the first route and construct your function_score query
dynamically based on that information right before sending the query into
ES.



(gkwelding) #4

Yeah, maybe I'm trying to be too much of a smart arse by trying to do it
all in one query. Maybe I need to break it up a bit.

(In reply to Binh Ly's post of Wednesday, March 5, 2014 1:23:16 PM UTC.)


(gkwelding) #5

Ok, I managed to get this working, kind of... I had to split my query into
2 queries. I created a new index, let's call it boostvalues. In
boostvalues I had the following fields:

  1. item_code (string, not analyzed)
  2. query_string (string, analyzed with the standard analyzer)
  3. boost_factor (integer, not analyzed)
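
For reference, a mapping along these lines would produce those three fields. This is only a sketch in the ES 1.x mapping syntax of the time: the type name `boost` and the `$client` variable are assumptions made for the example, not details taken from the setup described here.

```php
<?php
// Sketch of index-creation parameters for the boostvalues index (ES 1.x mapping syntax).
// The type name "boost" is a placeholder chosen for this example.
$indexParams = array(
    'index' => 'boostvalues',
    'body'  => array(
        'mappings' => array(
            'boost' => array(
                'properties' => array(
                    // Exact product identifier, so it is indexed without analysis.
                    'item_code'    => array('type' => 'string', 'index' => 'not_analyzed'),
                    // Free text to match against incoming searches.
                    'query_string' => array('type' => 'string', 'analyzer' => 'standard'),
                    // Numeric fields are never analyzed, so no index setting is needed.
                    'boost_factor' => array('type' => 'integer'),
                ),
            ),
        ),
    ),
);

// With the official PHP client this would be passed as:
// $client->indices()->create($indexParams);
```

Keeping item_code unanalyzed matters because it is later used in a term filter, which compares against the exact indexed value.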

Then for every search query that comes in I do a search against the
boostvalues index first. I then use any returned results to dynamically
populate the function_score arguments like so:

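    // $boostResults is the response from the search against the boostvalues index.
    if (isset($boostResults) && is_array($boostResults)
        && (int) $boostResults['hits']['total'] > 0) {
        foreach ($boostResults['hits']['hits'] as $boostValue) {
            // One function per boosted product: the filter picks the product,
            // the script returns its boost factor as a constant.
            // Field names follow the boostvalues fields listed above.
            $searchParams['body']['query']['function_score']['functions'][] = array(
                "filter" => array("term" => array("id" => $boostValue['_source']['item_code'])),
                "script_score" => array("script" => (string) $boostValue['_source']['boost_factor'])
            );
        }
    }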

I'm using the official Elasticsearch PHP library, by the way. This gives the
desired effect. The beauty of this mechanism, which I didn't realise until I
tried it, is that when fetching the boost values I can also apply function
scoring to that lookup, scaling the boost factor by how closely the incoming
query matches the stored query string.
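
A minimal sketch of that idea, under some assumptions: the lookup against boostvalues is itself wrapped in function_score, so each hit's `_score` becomes its match relevance multiplied by the stored boost_factor, and that score is then used as the boost in the product query. The helper names are made up for this example, the script uses the ES 1.x `doc['field'].value` syntax, and the product field `id` is assumed to hold the same value as item_code.

```php
<?php
// Hypothetical helper: builds the body for the lookup against the boostvalues index.
function buildBoostLookupBody($userQuery)
{
    return array(
        'query' => array(
            'function_score' => array(
                // How closely the incoming query matches the stored query string.
                'query' => array('match' => array('query_string' => $userQuery)),
                'functions' => array(
                    array('script_score' => array('script' => "doc['boost_factor'].value")),
                ),
                // Final _score = match relevance * stored boost factor.
                'boost_mode' => 'multiply',
            ),
        ),
    );
}

// Hypothetical helper: turns lookup hits into function_score functions for the product query.
function boostFunctionsFromHits(array $hits)
{
    $functions = array();
    foreach ($hits as $hit) {
        $functions[] = array(
            'filter' => array('term' => array('id' => $hit['_source']['item_code'])),
            // The hit's _score already combines closeness and boost_factor.
            'script_score' => array('script' => (string) $hit['_score']),
        );
    }
    return $functions;
}

// Example with a hand-written hit standing in for a real response.
$hits = array(array('_score' => 7.5, '_source' => array('item_code' => 'ABC123')));
$functions = boostFunctionsFromHits($hits);
```

The returned `$functions` array would be assigned to the `functions` key of the product query's function_score, in place of the fixed boost_factor used earlier.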

(In reply to Binh Ly's post of Wednesday, March 5, 2014 1:23:16 PM UTC.)

