Scoring per term match

jpotisch · September 18, 2015, 7:40pm

For our application, we search multiple variations of a field (phonetic, prefix, ngrams, etc.) in a should clause. We then use function_score functions to boost scores based on business-specific fields, so that more recent, more popular, etc. records rise to the top. We also allow fuzzy matches and a limited number of term misses (e.g. minimum_should_match = 75%).

Our app uses scores to determine the best match across types. For example, let's say we search across publishers, authors, books, and magazines. A user might type the name of any of these, or search across fields: "william shakespeare" should find the author, but "shakespeare hamlet" should find the book. They might also type "shakes haml", and we should bring back Hamlet, or "willem shakesper", and we should return the author.

Because of the other factors we use to determine ranking, I want a much simpler starting score than the full TF-IDF approach, which prefers all query terms matching all field terms and weights rarer terms higher.

Is there a way to make scoring use a simple linear method such that each term match gets a set amount, regardless of TF-IDF, terms that don't match, etc.? In other words, I want a search for "william shakespeare hamlet" against the book { title: "Hamlet", author: "William Shakespeare" } to get 3 points, "shakespeare hamlet" to get 2 points, "hamlet" to get 1 point, etc., and for those same queries to return exactly the same scores against the book { title: "Hamlet Is A Very Good Book", author: "William Shakespeare Was A Very Good Author" }.
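
To make the desired behavior concrete, here is a minimal Python sketch of the scoring I have in mind. It is not Elasticsearch code, only an illustration of the target numbers: the function names (tokenize, linear_score) are made up for this example, and it assumes exact, case-insensitive term matching, so it ignores the fuzzy, prefix, and phonetic variations described above.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def linear_score(query, doc):
    """Return one point per distinct query term found in any field of doc.

    Field length, term frequency, and term rarity are deliberately ignored.
    """
    doc_terms = set()
    for value in doc.values():
        doc_terms.update(tokenize(value))
    return sum(1 for term in set(tokenize(query)) if term in doc_terms)


# The two example books from the question.
short_book = {"title": "Hamlet", "author": "William Shakespeare"}
long_book = {
    "title": "Hamlet Is A Very Good Book",
    "author": "William Shakespeare Was A Very Good Author",
}

for query in ("william shakespeare hamlet", "shakespeare hamlet", "hamlet"):
    # Both books should receive the same score for the same query.
    print(query, linear_score(query, short_book), linear_score(query, long_book))
```

The point of the sketch is that extra words in a field and how rare a term is do not change the score; only the count of matched query terms does.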

Any guidance, not just a full solution, would be greatly appreciated!

Thanks,

-joel
