Sorted indices and early termination

I have been reading the documentation and blog post on index sorting and early termination, and I have a few questions. I have an index with multiple fields (simplified mappings below), most of which are text fields with a custom analyzer (I also use BM25 as my similarity algorithm). I also have an integer field called rank, which I would like to sort my index by so that searches can terminate early. I run a multi_match query (simplified query below) against these text fields, add a sort on rank to the query, set "track_total_hits": false, and set "size": 1000 (I want a total of 1000 docs returned).

Here are my questions:

  1. Since my query is not a bool query, how does Elasticsearch (ES) know whether a particular document should be considered or not?
  2. How many documents per shard will Elasticsearch look at before terminating (1000, I guess, but I am curious to know the answer)? Also, let's say I want it to look at a specific number of documents per shard: can I use terminate_after instead of "track_total_hits": false?
  3. I am guessing the results would be sorted in the order of the sort provided (in my example, "rank"). If I want to sort the results by BM25 score instead, would I need to perform rescoring, especially since the score of each document is not returned?
  4. Do I need the additional "sort" key in my query, since my index is already sorted on rank?

Simplified version of my Index Mappings

PUT test
{
  "settings": {
    "index": {
      "sort.field": [
        "rank"
      ],
      "sort.order": [
        "asc"
      ]
    }
  },
  "mappings": {
    "data": {
      "properties": {
        "my_field1": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "my_field2": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "rank": {
          "type": "integer"
        }
      }
    }
  }
}

Simplified version of my query

GET /test/_search
{
  "query": {
    "multi_match": {
      "query": "electric",
      "fields": ["my_field1", "my_field2"]
    }
  },
  "sort": [
    {
      "rank": {
        "order": "asc"
      }
    }
  ],
  "track_total_hits": false,
  "size": 1000
}
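
For question 2, the variant I have in mind would look roughly like the sketch below. It swaps "track_total_hits": false for "terminate_after", which as far as I understand caps the number of documents collected per shard. The "track_scores" flag is my assumption about how to get _score back while sorting on a field (question 3). I have not verified that this combination behaves the way I hope, and the value 1000 is only an example.

```
GET /test/_search
{
  "query": {
    "multi_match": {
      "query": "electric",
      "fields": ["my_field1", "my_field2"]
    }
  },
  "sort": [
    {
      "rank": {
        "order": "asc"
      }
    }
  ],
  "track_scores": true,
  "terminate_after": 1000,
  "size": 1000
}
```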
