Query string query gives different scores for identical documents


(Prashant Agrawal) #1

Hi All,

I was doing some analysis with the query_string query and found that identical documents get different scores for the same query. The details are below:

POST prashant/test
{
  "title" : "Downtown"
}

POST prashant/test
{
  "title" : "Downtown"
}

POST prashant/test
{
  "title" : "Downtown"
}

POST prashant/test
{
  "title" : "Downtown"
}

Now I run the following search:

POST prashant/_search
{
  "from": 0,
  "size": 50,
  "query": {
    "bool": {
      "should": [
        {
          "query_string": {
            "fields": [
              "title"
            ],
            "query": "downtown",
            "allow_leading_wildcard": true
          }
        },
        {
          "multi_match": {
            "type": "phrase_prefix",
            "fields": [
              "title"
            ],
            "boost": 10,
            "query": "downtown"
          }
        }
      ]
    }
  },
  "_source": [
    "title"
  ]
}

The output of the above query is:

  "hits": {
    "total": 4,
    "max_score": 3.1645029,
    "hits": [
      {
        "_index": "prashant",
        "_type": "test",
        "_id": "AV7NJ2PBko4-g528WSa6",
        "_score": 3.1645029,
        "_source": {
          "title": "Downtown"
        }
      },
      {
        "_index": "prashant",
        "_type": "test",
        "_id": "AV7NJ3Dnko4-g528WSa8",
        "_score": 3.1645029,
        "_source": {
          "title": "Downtown"
        }
      },
      {
        "_index": "prashant",
        "_type": "test",
        "_id": "AV7NJ2o3ko4-g528WSa7",
        "_score": 2.0055373,
        "_source": {
          "title": "Downtown"
        }
      },
      {
        "_index": "prashant",
        "_type": "test",
        "_id": "AV7NJ3bnko4-g528WSa9",
        "_score": 2.0055373,
        "_source": {
          "title": "Downtown"
        }
      }
    ] 

My concern is that all 4 documents have only one field, with the same value (Downtown), so why do they get different scores when I search?


(Jimferenczi) #2

The score of each document is computed locally on its shard, from the statistics of that shard only. Metrics such as the average document length and the frequency of the term are local to each shard, so identical content can receive different scores depending on which shard it lands on. If you want to align the scores across shards, you can use the dfs_query_then_fetch search type:


... which computes the distributed term frequencies.
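
To make the shard-local effect concrete, here is a rough sketch that reproduces the two scores from the question. It assumes the default BM25 similarity with k1 = 1.2 and b = 0.75, that two of the documents each sit alone in a shard while the other two share one, and that the final score is the query_string clause plus the multi_match clause boosted by 10. None of this is stated in the thread, and the function names are only illustrative. The last line estimates what every document would score once the statistics are shared across shards, which is what adding search_type=dfs_query_then_fetch as a URL parameter on the _search request asks for.

```python
import math

# Assumed BM25 defaults (k1, b); not confirmed anywhere in the thread.
K1, B = 1.2, 0.75


def bm25_idf(doc_count, doc_freq):
    """Inverse document frequency from the statistics of one shard."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


def bm25_tf_norm(freq, field_len, avg_field_len):
    """Term-frequency part of BM25; equals 1.0 for a one-term field of average length."""
    return freq * (K1 + 1) / (freq + K1 * (1 - B + B * field_len / avg_field_len))


def combined_score(doc_count, doc_freq):
    """Assumed total: query_string clause (boost 1) + multi_match clause (boost 10)."""
    base = bm25_idf(doc_count, doc_freq) * bm25_tf_norm(1, 1, 1)
    return base * 1 + base * 10


alone_in_shard = combined_score(1, 1)      # shard holds 1 doc, 1 doc matches
two_in_shard = combined_score(2, 2)        # shard holds 2 docs, both match
with_global_stats = combined_score(4, 4)   # statistics summed over all shards

print(round(alone_in_shard, 5), round(two_in_shard, 5), round(with_global_stats, 5))
```

Under these assumptions the first two values line up with the 3.1645029 and 2.0055373 reported above, which suggests the difference comes purely from how the four documents were distributed over the shards.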

