Ranking short vs. long documents vs. nested docs


(randel-2) #1

With default settings, a search ranks hits in documents with fewer words higher than hits in longer texts. The word count also seems to take all nested documents into account, not just the one that matched.

Let's say I have:

PUT /books/_doc/1
{
  "editions": [
    {
      "title": "apes"
    }
  ]
}

PUT /books/_doc/2
{
  "editions": [
    {
      "title": "a group of apes"
    }
  ]
}

PUT /books/_doc/3
{
  "editions": [
    {
      "title": "chimpansee kingdom"
    },
    {
      "title": "apes"
    }
  ]
}
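
The nested query below only works if editions is mapped as a nested field. The mapping is not shown in this post, so the following is an assumed minimal sketch, not the actual index definition. It uses the typeless mapping syntax of Elasticsearch 7 and later, and it would have to be created before indexing the documents above:

PUT /books
{
  "mappings": {
    "properties": {
      "editions": {
        "type": "nested",
        "properties": {
          "title": { "type": "text" }
        }
      }
    }
  }
}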

Search:

POST /books/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "nested": {
            "path": "editions",
            "query": {
              "match": {
                "editions.title": {
                  "query": "apes"
                }
              }
            }
          }
        }
      ]
    }
  }
}

Right now book 1 ranks highest, followed by book 2, with book 3 last.
Can I restrict that effect to the length of the nested document that contains the hit, instead of taking all nested documents into account, so that books 1 and 3 rank highest (equally), followed by book 2?
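
To check which field length each hit is actually scored with, the search above could be rerun with explain enabled. This is only a diagnostic sketch under the assumption that editions is a nested field; it does not change the ranking. The explanation attached to each hit and to each inner hit should list the field length and average field length that went into the score:

POST /books/_search
{
  "explain": true,
  "query": {
    "nested": {
      "path": "editions",
      "query": {
        "match": {
          "editions.title": {
            "query": "apes"
          }
        }
      },
      "inner_hits": {
        "explain": true
      }
    }
  }
}

If the explanations for books 1 and 3 report different lengths or different average lengths for editions.title, that would show where the ranking difference comes from.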

