Significant score increase after migration from 2.4 to 6.5

mgvarley · May 2, 2019, 4:41pm

We are in the process of migrating our stack from ES 2.4 to 6.5. Our plan is to make the bare minimum of changes to our application code and then, once our regression tests have passed, start taking advantage of some of the new features. All is going well, but we have run into an issue with scoring: our ES 6.5 searches return significantly higher scores than the "same" search running on ES 2.4 with the same index settings.

For example, the query below returns a score of 7.9601316 on ES 2.4 and 37.387424 on ES 6.5. Has there been a significant scoring change somewhere along the line that may be causing this? I haven't spotted anything in the release notes.

Many thanks for any help or advice. If there is a setting I can apply to get consistent scoring, that would be great. We can then look at moving to the new default approach.

{
  "query": {
    "bool": {
      "must": [
        {
          "multi_match": {
            "query": "Tom Dick Harry",
            "fields": [
              "field_1^1.2",
              "field_1.shingle^2",
              "field_1_*"
            ],
            "type": "best_fields",
            "minimum_should_match": "50%",
            "fuzziness": 1,
            "prefix_length": 2
          }
        }
      ],
      "should": [
        {
          "multi_match": {
            "query": "Tom Dick Harry",
            "fields": [
              "*field_x*"
            ],
            "type": "best_fields",
            "minimum_should_match": "75%",
            "fuzziness": 1,
            "prefix_length": 2,
            "_name": "primary_field_x"
          }
        },
        {
          "multi_match": {
            "query": "Bob",
            "fields": [
              "field_2*",
              "*field_x*"
            ],
            "type": "best_fields",
            "minimum_should_match": "100%",
            "fuzziness": 1,
            "prefix_length": 2,
            "_name": "field_4"
          }
        }
      ],
      "filter": [
        {
          "terms": {
            "field_5": [
              "alice"
            ]
          }
        },
        {
          "term": {
            "field_6": true
          }
        }
      ]
    }
  }
}

Christian_Dahlqvist · May 2, 2019, 5:21pm

I believe Elasticsearch switched from TF/IDF to BM25 as the default scoring mechanism in version 5.0. I wonder if that could explain this?
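If BM25 does turn out to be the cause, one way to test it might be to switch the whole index back to the older TF/IDF similarity at creation time. The following is a minimal sketch of the index settings, assuming an index created on 6.5 where the `classic` similarity is still available. It would go in the body of the create-index request.

```json
{
  "settings": {
    "index": {
      "similarity": {
        "default": {
          "type": "classic"
        }
      }
    }
  }
}
```

This changes the default for every field in the index, so it is probably better suited to a quick comparison than to a long-term setting.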

mgvarley · May 2, 2019, 7:32pm

Thank you so much, Christian. I had totally missed that change, having gone directly from 2.4 to 6.x. Reading through the docs, I understand that I need to tag all of my "type":"text" fields with "similarity":"classic" to get the results I am expecting. I am re-indexing all of my documents now (around 100m) and will re-test once that completes. If everything passes, I will start moving everything over to the new defaults. Thanks again for the quick reply!
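For reference, the per-field change described above might look roughly like this in a 6.5 mapping. This is a sketch only: the `_doc` type name is an assumption, analyzers are omitted, and `field_1` with its `shingle` sub-field simply mirrors the field names used in the query. The sub-field carries its own `similarity` on the assumption that it is not inherited from the parent field.

```json
{
  "mappings": {
    "_doc": {
      "properties": {
        "field_1": {
          "type": "text",
          "similarity": "classic",
          "fields": {
            "shingle": {
              "type": "text",
              "similarity": "classic"
            }
          }
        }
      }
    }
  }
}
```

Fields matched by the wildcard patterns in the query (`field_1_*`, `*field_x*`, `field_2*`) would need the same treatment if they are text fields.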

mgvarley · May 3, 2019, 7:48am

Unfortunately this did not work. After re-indexing with the similarity:classic setting applied to all of my full-text fields, I am still getting significantly higher scores in 6.5 than in 2.4. Any other suggestions or pointers would be hugely appreciated, as I am completely stumped.
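One way to narrow this down might be to run the same query with `explain` enabled on both clusters and compare the score breakdown for a single document that appears in both result sets. Below is a trimmed sketch that keeps only the `must` clause from the original query. Limiting `size` to 1 is just an assumption to keep the output readable.

```json
{
  "explain": true,
  "size": 1,
  "query": {
    "bool": {
      "must": [
        {
          "multi_match": {
            "query": "Tom Dick Harry",
            "fields": [
              "field_1^1.2",
              "field_1.shingle^2",
              "field_1_*"
            ],
            "type": "best_fields",
            "minimum_should_match": "50%",
            "fuzziness": 1,
            "prefix_length": 2
          }
        }
      ]
    }
  }
}
```

Each hit then carries an `_explanation` tree. Comparing the two trees should show whether the gap comes from the per-term similarity formula or from factors applied around it. The 2.4 output may include `queryNorm` and `coord` entries, and it is worth checking whether they still appear on 6.5.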

system · May 31, 2019, 7:48am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
