I have been reading the documentation and a blog post on index sorting and early termination, and I have a few questions. I have an index with multiple fields (simplified mappings below), most of which are text fields with a custom analyzer (I also use BM25 as my similarity algorithm). There is also an integer field called rank, which I would like to sort my index by so that queries can terminate early. I run a multi_match query (simplified query below) against these text fields, add a sort on rank to the query, and set "track_total_hits": false and "size": 1000 (I want a total of 1000 documents returned).
Here are my questions:
- Since my query is not a bool query, how does Elasticsearch (ES) know whether a particular document should be considered or not?
- How many documents per shard will Elasticsearch look at before terminating (1000, I guess, but I am curious to know the answer)? Also, if I want it to look at a specific number of documents per shard, can I use terminate_after instead of "track_total_hits": false?
- I am guessing the responses will be sorted in the order of the sort provided ("rank" in my example). If I want to sort the responses by BM25 score instead, would I need to perform rescoring, especially since the score of each document is not returned?
- Do I need the additional "sort" key in my query, since my index is already sorted by default?
Simplified version of my index mappings:
PUT test
{
  "settings": {
    "index": {
      "sort.field": [
        "rank"
      ],
      "sort.order": [
        "asc"
      ]
    }
  },
  "mappings": {
    "data": {
      "properties": {
        "my_field1": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "my_field2": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "rank": {
          "type": "integer"
        }
      }
    }
  }
}
Simplified version of my query:
GET /test/_search
{
  "query": {
    "multi_match": {
      "query": "electric",
      "fields": ["my_field1", "my_field2"]
    }
  },
  "sort": [
    {
      "rank": {
        "order": "asc"
      }
    }
  ],
  "track_total_hits": false,
  "size": 1000
}
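
To make the second question concrete, this is roughly the variant I have in mind. It is only a sketch: the value 5000 is an arbitrary placeholder I picked for illustration, and I am assuming that terminate_after caps the number of documents collected on each shard, which is how I read the search API documentation.

```
# Hypothetical variant: terminate_after replaces "track_total_hits": false.
# 5000 is a placeholder, not a recommended value.
GET /test/_search
{
  "query": {
    "multi_match": {
      "query": "electric",
      "fields": ["my_field1", "my_field2"]
    }
  },
  "sort": [
    {
      "rank": {
        "order": "asc"
      }
    }
  ],
  "terminate_after": 5000,
  "size": 1000
}
```

What I am unsure about is whether this version still benefits from the index sort, or whether it simply stops after the first 5000 matching documents on each shard.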