Scoring in Exact and Phrase Matching



I'm fairly new to Elasticsearch and Lucene. I went quickly through the Definitive Guide and understand how the score is calculated for boolean, term, and multi-term queries. The basic weighting is TF-IDF, and scoring is based on a custom vector space model (VSM). Depending on how the query is constructed, finalqueryscore = (booleanqueryscore + termscore1 + termscore2 + ...), where booleanqueryscore and the term scores are based on the custom VSM.

However, I'm not clear on what kind of scoring is used for exact and phrase matching. For an exact match, is the score always 1? Similar to the above, is phrasequeryscore = booleanqueryscore + termscore1 + proximity (edit distance) + ...?

The only relevant information I found is: "Individual queries may combine the TF/IDF score with other factors such as the term proximity in phrase queries, or term similarity in fuzzy queries [1]." How exactly is proximity combined?


(Nik Everett) #2

I'm not sure of a better place to look than the implementation:

Meaning, I don't remember seeing a place where it was better documented. You still have to jump around into things like DocScorer#computeSlopFactor to get the full picture.
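
For a rough idea of what that slop factor does, here is a minimal sketch. It assumes the classic TF-IDF similarity, where each phrase match contributes 1 / (distance + 1) to the phrase frequency and tf is the square root of that frequency. The function names are made up for illustration and are not Lucene's API, and the exact formula can differ by version and similarity, so treat this as an approximation and check the source for the version you run.

```python
import math


def slop_factor(distance: int) -> float:
    """Weight of one phrase match (assumed classic formula: 1 / (distance + 1)).

    distance is how far the matched terms are from their exact positions;
    an exact phrase match has distance 0 and therefore weight 1.0.
    """
    return 1.0 / (distance + 1)


def phrase_freq(match_distances) -> float:
    """Phrase frequency in one document: the sum of the slop factors of its matches."""
    return sum(slop_factor(d) for d in match_distances)


def tf(freq: float) -> float:
    """Assumed classic TF component: square root of the (possibly fractional) frequency."""
    return math.sqrt(freq)


# Two exact occurrences of the phrase: frequency is the plain count.
exact = phrase_freq([0, 0])
# One exact occurrence plus one match that is 3 positions off.
sloppy = phrase_freq([0, 3])

print(exact, tf(exact))
print(sloppy, tf(sloppy))
```

Under these assumptions proximity is not added as a separate term. It scales the frequency that goes into tf, so a looser match counts as a fraction of an occurrence, and an exact match is scored like a term that occurs that many times rather than always scoring 1.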
