Puzzling Scoring situation using multi_match

I am having a hard time understanding why record #1 scores higher than record #2.
Record #2 has more matches (see the Highlight entry). Both terms are found, including in a field with a higher boost (ss_categories_full vs. ss_name_full), and the matches do not need fuzziness, yet the score is much lower (48 vs. 70).
I tried multi_match with both best_fields and most_fields, but the result is the same. I am puzzled!

Highlight for Max Score record:

"highlight": {
          "ss_name_full": [
            "Lundhs <em>EmeraldÂ</em>®"
          ]
        }

Highlight for 2nd Position Record:

"highlight": {
      "ss_searchable_1": [
        "<em>emerald</em> palace",
        "wide width <em>textiles</em>"
      ],
      "ss_name_full": [
        "Standard <em>Textile</em>"
      ],
      "ss_categories_full": [
        "<em>Textile</em>",
        "<em>Textile</em>>Vertically Hanging <em>Textile</em>",
        "<em>Textile</em>"
      ]
    }

Important part of the Query:

"minimum_should_match": 1,
                    "should": [
                        {
                            "match_phrase": {
                                "ss_name_full": {
                                    "query": "emerald textile"
                                }
                            }
                        },
                        {
                            "multi_match": {
                                "boost": 1,
                                "fields": [
                                    "sku^10",
                                    "rfid^10",
                                    "ss_categories_full^8",
                                    "ss_name_full^5",
                                    "manufacturer_sku.txt^3.2",
                                    "ss_searchable_1^3",
                                    "ss_searchable_2^2",
                                    "ss_searchable_3"
                                ],
                                "fuzziness": "1",
                                "query": "emerald textile",
                                "slop": 2,
                                "tie_breaker": 0.3,
                                "type": "most_fields"
                            }
                        }
                    ]

Results w/ Explain = true

{
  "took": 21,
  "timed_out": false,
  "_shards": {
    "total": 9,
    "successful": 9,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 10000,
      "relation": "gte"
    },
    "max_score": 70.357056,
    "hits": [
      {
        "_shard": "[materialbank_product_en_v4][7]",
        "_node": "Q2Mj0tZETx6ZeFHu9t-hLA",
        "_index": "materialbank_product_en_v4",
        "_id": "821937",
        "_score": 70.357056,
        "_ignored": [
          "install_07",
          "install_06",
          "install_05",
          "install_04",
          "install_09",
          "install_08",
          "install_10",
          "install_03",
          "install_02",
          "install_01"
        ],
        "_source": {
          "ss_searchable_1": [
            "low emitting",
            "low emitting/low vocs",
            "fireplace",
            "fireplace/fireplace surround",
            "flooring",
            "kitchen",
            "furniture",
            "stair & elevator",
            "bath",
            "countertop",
            "wall/backsplash",
            "pool & fountain",
            "wall",
            "the lundhs real stone collection",
            "astm c-170 - 38850 psi",
            "stone",
            "norway",
            "fireplace surround",
            "backsplash",
            "slab",
            "modular",
            "natural",
            "leed compliant",
            "large format tile",
            "field tile"
          ],
          "ss_searchable_2": [
            "atsm c-1352 - 171.1 wear index value; en 14157",
            "typically stocked",
            "stocked in north america",
            "made to order / special order",
            "medium",
            "light",
            "monochromatic",
            "polished silk/leather honed",
            "rectangle",
            "square",
            "indoor",
            "outdoor",
            "may contribute toward leed credits.",
            "natural",
            "residential use",
            "commercial use",
            "freeze/thaw resistant"
          ],
          "ss_searchable_3": [
            "material",
            "bs en 12720:2009 - resistant",
            "astm c-99 modulus of rupture - 1846 psi",
            "atsm c-97 - 0.26% resistant"
          ],
          "rfid": null,
          "sku": "100511650",
          "manufacturer_sku": [
            "Emerald-Polished",
            "Emerald-Silk-Leather",
            "Emerald-Honed"
          ]
        },
        "highlight": {
          "ss_name_full": [
            "Lundhs <em>EmeraldÂ</em>®"
          ]
        },
        "_explanation": {
          "value": 70.357056,
          "description": "sum of:",
          "details": [
            {
              "value": 70.357056,
              "description": "weight(FunctionScoreQuery((ss_name_full:\"emerald textile\" ((ss_name_full:emerald (ss_name_full:emeralda)^0.85714287 (ss_name_full:emeraldâ)^0.85714287 ss_name_full:textile (ss_name_full:textiles)^0.85714287)^5.0 | MatchNoDocsQuery(\"empty BooleanQuery\") | MatchNoDocsQuery(\"empty BooleanQuery\") | (Synonym(ss_searchable_2:textil ss_searchable_2:textile))^2.0 | (manufacturer_sku.txt:emerald)^3.2 | (Synonym(ss_categories_full:textil ss_categories_full:textile))^8.0 | Synonym(ss_searchable_3:textil ss_searchable_3:textile) | (ss_searchable_1:emerald Synonym(ss_searchable_1:textil ss_searchable_1:textile))^3.0)~0.3 #visibility:Catalog| Search)~1, scored by boost(queryboost(score(qty:[-2147483648 TO 3]))^1.0))), result of:",
              "details": [
                {
                  "value": 70.357056,
                  "description": "sum of:",
                  "details": [
                    {
                      "value": 70.357056,
                      "description": "max plus 0.3 times others of:",
                      "details": [
                        {
                          "value": 70.357056,
                          "description": "sum of:",
                          "details": [
                            {
                              "value": 35.178528,
                              "description": "weight(ss_name_full:emeralda in 14860) [PerFieldSimilarity], result of:",
                              "details": [
                                {
                                  "value": 35.178528,
                                  "description": "score(freq=1.0), computed as boost * idf * tf from:",
                                  "details": [
                                    {
                                      "value": 9.428572,
                                      "description": "boost",
                                      "details": []
                                    },
                                    {
                                      "value": 8.087933,
                                      "description": "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
                                      "details": [
                                        {
                                          "value": 10,
                                          "description": "n, number of documents containing term",
                                          "details": []
                                        },
                                        {
                                          "value": 34176,
                                          "description": "N, total number of documents with field",
                                          "details": []
                                        }
                                      ]
                                    },
                                    {
                                      "value": 0.46131146,
                                      "description": "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
                                      "details": [
                                        {
                                          "value": 1,
                                          "description": "freq, occurrences of term within document",
                                          "details": []
                                        },
                                        {
                                          "value": 1.2,
                                          "description": "k1, term saturation parameter",
                                          "details": []
                                        },
                                        {
                                          "value": 0.75,
                                          "description": "b, length normalization parameter",
                                          "details": []
                                        },
                                        {
                                          "value": 7,
                                          "description": "dl, length of field",
                                          "details": []
                                        },
                                        {
                                          "value": 7.2602997,
                                          "description": "avgdl, average length of field",
                                          "details": []
                                        }
                                      ]
                                    }
                                  ]
                                }
                              ]
                            },
                            {
                              "value": 35.178528,
                              "description": "weight(ss_name_full:emeraldâ in 14860) [PerFieldSimilarity], result of:",
                              "details": [
                                {
                                  "value": 35.178528,
                                  "description": "score(freq=1.0), computed as boost * idf * tf from:",
                                  "details": [
                                    {
                                      "value": 9.428572,
                                      "description": "boost",
                                      "details": []
                                    },
                                    {
                                      "value": 8.087933,
                                      "description": "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
                                      "details": [
                                        {
                                          "value": 10,
                                          "description": "n, number of documents containing term",
                                          "details": []
                                        },
                                        {
                                          "value": 34176,
                                          "description": "N, total number of documents with field",
                                          "details": []
                                        }
                                      ]
                                    },
                                    {
                                      "value": 0.46131146,
                                      "description": "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
                                      "details": [
                                        {
                                          "value": 1,
                                          "description": "freq, occurrences of term within document",
                                          "details": []
                                        },
                                        {
                                          "value": 1.2,
                                          "description": "k1, term saturation parameter",
                                          "details": []
                                        },
                                        {
                                          "value": 0.75,
                                          "description": "b, length normalization parameter",
                                          "details": []
                                        },
                                        {
                                          "value": 7,
                                          "description": "dl, length of field",
                                          "details": []
                                        },
                                        {
                                          "value": 7.2602997,
                                          "description": "avgdl, average length of field",
                                          "details": []
                                        }
                                      ]
                                    }
                                  ]
                                }
                              ]
                            }
                          ]
                        }
                      ]
                    },
                    {
                      "value": 0,
                      "description": "match on required clause, product of:",
                      "details": [
                        {
                          "value": 0,
                          "description": "# clause",
                          "details": []
                        },
                        {
                          "value": 1,
                          "description": "visibility:Catalog| Search",
                          "details": []
                        }
                      ]
                    }
                  ]
                }
              ]
            },
            {
              "value": 0,
              "description": "match on required clause, product of:",
              "details": [
                {
                  "value": 0,
                  "description": "# clause",
                  "details": []
                },
                {
                  "value": 1,
                  "description": "FieldExistsQuery [field=_primary_term]",
                  "details": []
                }
              ]
            }
          ]
        }
      },
      {
        "_shard": "[materialbank_product_en_v4][5]",
        "_node": "pA3QwgxUT4W8IOSDkb2QWQ",
        "_index": "materialbank_product_en_v4",
        "_id": "502774",
        "_score": 48.005,
        "_source": {
          "ss_searchable_1": [
            "window",
            "haze",
            "twilight",
            "emerald palace",
            "city scape",
            "volcano",
            "static",
            "woven",
            "polyester",
            "roll",
            "wide width textiles",
            "abstract",
            "texture"
          ],
          "ss_searchable_2": [
            "flammability",
            "typically stocked",
            "none",
            "trevira cs",
            "light",
            "medium",
            "polychromatic",
            "nfpa 701",
            "indoor",
            "translucent / sheer",
            "abstract / organic",
            "texture",
            "commercial use",
            "contemporary"
          ],
          "ss_searchable_3": [
            "material",
            "railroaded",
            "small"
          ],
          "rfid": null,
          "sku": "100156051",
          "manufacturer_sku": [
            "SD013807",
            "SD013806",
            "SD013801",
            "SD013803",
            "SD013804",
            "SD013805"
          ]
        },
        "highlight": {
          "ss_searchable_1": [
            "<em>emerald</em> palace",
            "wide width <em>textiles</em>"
          ],
          "ss_name_full": [
            "Standard <em>Textile</em>"
          ],
          "ss_categories_full": [
            "<em>Textile</em>",
            "<em>Textile</em>>Vertically Hanging <em>Textile</em>",
            "<em>Textile</em>"
          ]
        },
        "_explanation": {
          "value": 48.005,
          "description": "sum of:",
          "details": [
            {
              "value": 48.005,
              "description": "weight(FunctionScoreQuery((ss_name_full:\"emerald textile\" ((ss_name_full:emerald ss_name_full:textile (ss_name_full:textiles)^0.85714287)^5.0 | (Synonym(ss_searchable_2:textil ss_searchable_2:textile))^2.0 | MatchNoDocsQuery(\"empty BooleanQuery\") | MatchNoDocsQuery(\"empty BooleanQuery\") | (manufacturer_sku.txt:emerald)^3.2 | (Synonym(ss_categories_full:textil ss_categories_full:textile))^8.0 | (ss_searchable_1:emerald (ss_searchable_1:esmerald)^0.85714287 Synonym(ss_searchable_1:textil ss_searchable_1:textile))^3.0 | Synonym(ss_searchable_3:textil ss_searchable_3:textile))~0.3 #visibility:Catalog| Search)~1, scored by boost(queryboost(score(qty:[-2147483648 TO 3]))^1.0))), result of:",
              "details": [
                {
                  "value": 48.005,
                  "description": "sum of:",
                  "details": [
                    {
                      "value": 48.005,
                      "description": "max plus 0.3 times others of:",
                      "details": [
                        {
                          "value": 10.20312,
                          "description": "sum of:",
                          "details": [
                            {
                              "value": 10.20312,
                              "description": "weight(ss_name_full:textile in 1857) [PerFieldSimilarity], result of:",
                              "details": [
                                {
                                  "value": 10.20312,
                                  "description": "score(freq=1.0), computed as boost * idf * tf from:",
                                  "details": [
                                    {
                                      "value": 11,
                                      "description": "boost",
                                      "details": []
                                    },
                                    {
                                      "value": 2.1248493,
                                      "description": "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
                                      "details": [
                                        {
                                          "value": 4068,
                                          "description": "n, number of documents containing term",
                                          "details": []
                                        },
                                        {
                                          "value": 34059,
                                          "description": "N, total number of documents with field",
                                          "details": []
                                        }
                                      ]
                                    },
                                    {
                                      "value": 0.4365281,
                                      "description": "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
                                      "details": [
                                        {
                                          "value": 1,
                                          "description": "freq, occurrences of term within document",
                                          "details": []
                                        },
                                        {
                                          "value": 1.2,
                                          "description": "k1, term saturation parameter",
                                          "details": []
                                        },
                                        {
                                          "value": 0.75,
                                          "description": "b, length normalization parameter",
                                          "details": []
                                        },
                                        {
                                          "value": 8,
                                          "description": "dl, length of field",
                                          "details": []
                                        },
                                        {
                                          "value": 7.266831,
                                          "description": "avgdl, average length of field",
                                          "details": []
                                        }
                                      ]
                                    }
                                  ]
                                }
                              ]
                            }
                          ]
                        },
                        {
                          "value": 21.658363,
                          "description": "weight(Synonym(ss_categories_full:textil ss_categories_full:textile) in 1857) [PerFieldSimilarity], result of:",
                          "details": [
                            {
                              "value": 21.658363,
                              "description": "score(freq=8.0), computed as boost * idf * tf from:",
                              "details": [
                                {
                                  "value": 17.6,
                                  "description": "boost",
                                  "details": []
                                },
                                {
                                  "value": 1.3459563,
                                  "description": "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
                                  "details": [
                                    {
                                      "value": 8865,
                                      "description": "n, number of documents containing term",
                                      "details": []
                                    },
                                    {
                                      "value": 34059,
                                      "description": "N, total number of documents with field",
                                      "details": []
                                    }
                                  ]
                                },
                                {
                                  "value": 0.91428584,
                                  "description": "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
                                  "details": [
                                    {
                                      "value": 8,
                                      "description": "termFreq=8.0",
                                      "details": []
                                    },
                                    {
                                      "value": 1.2,
                                      "description": "k1, term saturation parameter",
                                      "details": []
                                    },
                                    {
                                      "value": 0.75,
                                      "description": "b, length normalization parameter",
                                      "details": []
                                    },
                                    {
                                      "value": 6,
                                      "description": "dl, length of field",
                                      "details": []
                                    },
                                    {
                                      "value": 12.00003,
                                      "description": "avgdl, average length of field",
                                      "details": []
                                    }
                                  ]
                                }
                              ]
                            }
                          ]
                        },
                        {
                          "value": 38.446556,
                          "description": "sum of:",
                          "details": [
                            {
                              "value": 25.229454,
                              "description": "weight(ss_searchable_1:emerald in 1857) [PerFieldSimilarity], result of:",
                              "details": [
                                {
                                  "value": 25.229454,
                                  "description": "score(freq=1.0), computed as boost * idf * tf from:",
                                  "details": [
                                    {
                                      "value": 6.6000004,
                                      "description": "boost",
                                      "details": []
                                    },
                                    {
                                      "value": 6.0805163,
                                      "description": "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
                                      "details": [
                                        {
                                          "value": 76,
                                          "description": "n, number of documents containing term",
                                          "details": []
                                        },
                                        {
                                          "value": 33449,
                                          "description": "N, total number of documents with field",
                                          "details": []
                                        }
                                      ]
                                    },
                                    {
                                      "value": 0.62867105,
                                      "description": "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
                                      "details": [
                                        {
                                          "value": 1,
                                          "description": "freq, occurrences of term within document",
                                          "details": []
                                        },
                                        {
                                          "value": 1.2,
                                          "description": "k1, term saturation parameter",
                                          "details": []
                                        },
                                        {
                                          "value": 0.75,
                                          "description": "b, length normalization parameter",
                                          "details": []
                                        },
                                        {
                                          "value": 17,
                                          "description": "dl, length of field",
                                          "details": []
                                        },
                                        {
                                          "value": 52.639362,
                                          "description": "avgdl, average length of field",
                                          "details": []
                                        }
                                      ]
                                    }
                                  ]
                                }
                              ]
                            },
                            {
                              "value": 13.2171,
                              "description": "weight(Synonym(ss_searchable_1:textil ss_searchable_1:textile) in 1857) [PerFieldSimilarity], result of:",
                              "details": [
                                {
                                  "value": 13.2171,
                                  "description": "score(freq=1.0), computed as boost * idf * tf from:",
                                  "details": [
                                    {
                                      "value": 6.6000004,
                                      "description": "boost",
                                      "details": []
                                    },
                                    {
                                      "value": 3.1854353,
                                      "description": "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
                                      "details": [
                                        {
                                          "value": 1383,
                                          "description": "n, number of documents containing term",
                                          "details": []
                                        },
                                        {
                                          "value": 33449,
                                          "description": "N, total number of documents with field",
                                          "details": []
                                        }
                                      ]
                                    },
                                    {
                                      "value": 0.62867105,
                                      "description": "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
                                      "details": [
                                        {
                                          "value": 1,
                                          "description": "termFreq=1.0",
                                          "details": []
                                        },
                                        {
                                          "value": 1.2,
                                          "description": "k1, term saturation parameter",
                                          "details": []
                                        },
                                        {
                                          "value": 0.75,
                                          "description": "b, length normalization parameter",
                                          "details": []
                                        },
                                        {
                                          "value": 17,
                                          "description": "dl, length of field",
                                          "details": []
                                        },
                                        {
                                          "value": 52.639362,
                                          "description": "avgdl, average length of field",
                                          "details": []
                                        }
                                      ]
                                    }
                                  ]
                                }
                              ]
                            }
                          ]
                        }
                      ]
                    },
                    {
                      "value": 0,
                      "description": "match on required clause, product of:",
                      "details": [
                        {
                          "value": 0,
                          "description": "# clause",
                          "details": []
                        },
                        {
                          "value": 1,
                          "description": "visibility:Catalog| Search",
                          "details": []
                        }
                      ]
                    }
                  ]
                }
              ]
            },
            {
              "value": 0,
              "description": "match on required clause, product of:",
              "details": [
                {
                  "value": 0,
                  "description": "# clause",
                  "details": []
                },
                {
                  "value": 1,
                  "description": "FieldExistsQuery [field=_primary_term]",
                  "details": []
                }
              ]
            }
          ]
        }
      }
    ]
  }
}

Hi,

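Elasticsearch's default scoring function, BM25, uses not only the term frequency within the document but also per-shard statistics for each term, such as how many documents contain it (idf) and the average field length. In your explain output, the fuzzy term matched in ss_name_full occurs in only 10 of 34176 documents (idf about 8.09), while textile occurs in 4068 of 34059 documents (idf about 2.12), so the rare term contributes far more to the score.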
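To make the arithmetic concrete, here is a minimal Python sketch that recomputes both scores from the numbers in the explain output above. The formulas are the ones printed in the explain descriptions; the function names are my own, and the boost values are copied as reported rather than derived. Treat it as a sanity check, not as how Elasticsearch is implemented.

```python
import math

def bm25_idf(n, N):
    # idf, computed as log(1 + (N - n + 0.5) / (n + 0.5))
    return math.log(1 + (N - n + 0.5) / (n + 0.5))

def bm25_tf(freq, dl, avgdl, k1=1.2, b=0.75):
    # tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl))
    return freq / (freq + k1 * (1 - b + b * dl / avgdl))

def term_score(boost, n, N, freq, dl, avgdl):
    # score = boost * idf * tf, as shown in the explain output
    return boost * bm25_idf(n, N) * bm25_tf(freq, dl, avgdl)

def max_plus_tie_breaker(scores, tie_breaker):
    # "max plus 0.3 times others of:" from the explain output
    best = max(scores)
    return best + tie_breaker * (sum(scores) - best)

# Hit 1 (_id 821937): two fuzzy expansions in ss_name_full, identical stats.
fuzzy_term = term_score(9.428572, 10, 34176, 1, 7, 7.2602997)
hit1 = max_plus_tie_breaker([2 * fuzzy_term], 0.3)

# Hit 2 (_id 502774): three fields match, but only the best one counts fully.
name = term_score(11, 4068, 34059, 1, 8, 7.266831)
categories = term_score(17.6, 8865, 34059, 8, 6, 12.00003)
searchable_1 = (
    term_score(6.6000004, 76, 33449, 1, 17, 52.639362)      # emerald
    + term_score(6.6000004, 1383, 33449, 1, 17, 52.639362)  # textil/textile
)
hit2 = max_plus_tie_breaker([name, categories, searchable_1], 0.3)

print(f"hit 1: {hit1:.3f}")
print(f"hit 2: {hit2:.3f}")
```

The results should land very close to the reported 70.357056 and 48.005 (small differences come from float precision). Note how the ss_categories_full match, despite its high boost and freq=8, only adds 0.3 times its score because it is not the best-scoring field.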

So it is no surprise that the same text in different fields gets different scores for the same query. If you want to score fields purely on whether the text matches, you may need a function_score and/or constant_score query, with match_phrase queries in filter context, to build your own customized scoring.
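As a rough illustration of the constant_score idea, the sketch below builds a request body in Python where each field contributes a fixed boost when it matches, regardless of idf or field length. The helper names and the choice of three fields are assumptions for the example; the field names and boosts are taken from the query in the question. You could swap match_phrase for match if the two words do not need to be adjacent.

```python
import json

def constant_score_clause(field, text, boost):
    # Hypothetical helper: the filter only decides match / no match,
    # and a matching document gets exactly `boost` from this clause.
    return {
        "constant_score": {
            "filter": {"match_phrase": {field: {"query": text}}},
            "boost": boost,
        }
    }

def build_query(text, field_boosts):
    # bool/should sums the scores of all matching clauses, so a document
    # matching several fields scores the sum of those fields' boosts.
    return {
        "query": {
            "bool": {
                "should": [
                    constant_score_clause(field, text, boost)
                    for field, boost in field_boosts.items()
                ],
                "minimum_should_match": 1,
            }
        }
    }

# Subset of the fields and boosts from the original multi_match.
field_boosts = {
    "ss_categories_full": 8,
    "ss_name_full": 5,
    "ss_searchable_1": 3,
}

body = build_query("emerald textile", field_boosts)
print(json.dumps(body, indent=2))
```

With this shape, a match in ss_categories_full always outweighs a match in ss_name_full, which is the behavior the boosts in the question seem to be aiming for.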
