How does Elasticsearch support customized ranking?


#1

Hello Elastic experts,

Each record has a few float features, and I want to rank the results by a linear combination of those features.

For example,

User 1 has feature 1 with value 0.1 and feature 2 with value 0.2;
User 2 has feature 1 with value 0.3 and feature 2 with value 0.1.

If feature 1 has weight 1 and feature 2 has weight 2, the score of user 1 is 0.1 * 1 + 0.2 * 2 = 0.5, while user 2 has score 0.3 * 1 + 0.1 * 2 = 0.5, so with these weights the two users tie. In general, the user with the higher score should rank first.
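To make the intended scoring concrete, here is a minimal Python sketch of the weighted sum described above. It runs outside Elasticsearch, and the names (linear_score, users, weights) are hypothetical, used only for illustration.

```python
def linear_score(features, weights):
    """Weighted sum of feature values: sum(feature_i * weight_i)."""
    return sum(f * w for f, w in zip(features, weights))


# Weights for feature 1 and feature 2, taken from the example above.
weights = [1.0, 2.0]

# Feature values per user, taken from the example above.
users = {
    "user1": [0.1, 0.2],
    "user2": [0.3, 0.1],
}

# Score every user, then order user names by descending score.
scores = {name: linear_score(feats, weights) for name, feats in users.items()}
ranked = sorted(scores, key=scores.get, reverse=True)

for name in ranked:
    print(name, round(scores[name], 6))
```

With the values given, both users come out at 0.5, so a different weight or feature value would be needed to separate them.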

I am wondering whether Elasticsearch supports this kind of customized ranking. Any sample index design or code would be highly appreciated.

thanks in advance,
Lin


(Colin Goodheart-Smithe) #2

You will probably find the function score query useful for what you are trying to do: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html


#3

Thanks Colin,

I am wondering whether the score functions are executed in parallel on each node, so that scoring scales out fully. My concern is that if scoring runs on a single node and the result set is too big, it may hit the limits of that node. A bit more detail on the internals of the score function would be appreciated.

regards,
Lin


(Jason Wee) #4

IIRC, if you are only using filters and a query inside the function score, it should not really matter. It matters only if your function score query also comes with sorting. How much data do you have?


#5

Thanks Jason,

I need to return the top N documents. In that case, how does Elasticsearch handle this internally? Is there a single node that handles the sorting? The dataset is about 1B documents, and I want to return the top 1M and the top 10M. Any advice is appreciated.

regards,
Lin


#6

Hi Colin,

I studied the documentation and it is really helpful. I am a new Elasticsearch user and still a bit lost on how to use the function score to implement my use case. To keep it simple: each document has two fields, f1 and f2, where f1 has weight 0.1 and f2 has weight 0.2. I want to score as follows, and would appreciate some more detail on which score function I should use:

  1. Retrieve the value of field f1 and multiply it by 0.1;
  2. Retrieve the value of field f2 and multiply it by 0.2;
  3. Sum the results of steps 1 and 2, which gives the final score of the document;
  4. Return the top 100 documents.

regards,
Lin


(Colin Goodheart-Smithe) #7

  1. Use the field_value_factor function on field f1 with a factor of 0.1
  2. Use the field_value_factor function on field f2 with a factor of 0.2
  3. Set score_mode to sum and boost_mode to replace
  4. Use the size parameter to limit the response to the top 100 documents
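
The four steps above can be sketched as a function_score request body, built here as a plain Python dictionary and printed as JSON. This is an illustrative sketch rather than a tested production query: the helper name build_linear_score_query is hypothetical, match_all is assumed as the base query, and "missing": 0 is an added assumption so that documents lacking a field contribute nothing instead of raising an error.

```python
import json


def build_linear_score_query(weights, size=100):
    """Build a function_score search body scoring docs as sum(field * weight)."""
    # Steps 1 and 2: one field_value_factor function per weighted field.
    functions = [
        {
            "field_value_factor": {
                "field": field,
                "factor": factor,
                "missing": 0,  # assumption: treat an absent field as 0
            }
        }
        for field, factor in weights.items()
    ]
    return {
        # Step 4: return only the top `size` documents.
        "size": size,
        "query": {
            "function_score": {
                "query": {"match_all": {}},  # assumption: score every document
                "functions": functions,
                # Step 3: add the function results together, and use that sum
                # in place of the base query score.
                "score_mode": "sum",
                "boost_mode": "replace",
            }
        },
    }


body = build_linear_score_query({"f1": 0.1, "f2": 0.2})
print(json.dumps(body, indent=2))
```

The printed JSON would be sent as the body of a search request against the index's _search endpoint. Note that recent Elasticsearch versions reject negative scores, so this approach assumes non-negative field values and weights.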
