Can someone please explain to me why I am getting such a random result when I sort my data.
I am just doing a simple query, trying to return all the documents filtered my a companyId sorting my lastName in desc order.
//Return all the employees from a specific company ordering by lastName asc | desc
GET employee-index-sorting
{
"query": {
"bool": {
"filter": {
"term": {
"companyId": 3179
}
}
}
},
"sort": [
{
"lastName.keyword": { <-- Should this be keyword? or not_analyzed
"order": "desc"
}
}
]
}
The result looks as follows, but noticed that last names like van der Mescht and van Breda appears before Zwane and Zwezwe. Why would that be?
{
"_index": "employee-index",
"_type": "_doc",
"_id": "637467",
"_score": null,
"_source": {
"companyId": 3179,
"firstName": "Name",
"lastName": "van der Mescht",
},
"sort": [
"van der Mescht"
]
},
{
"_index": "employee-index",
"_type": "_doc",
"_id": "678335",
"_score": null,
"_source": {
"companyId": 3179,
"firstName": "Name3",
"lastName": "van Breda",
},
"sort": [
"van Breda"
]
},
{
"_index": "employee-index",
"_type": "_doc",
"_id": "113896",
"_score": null,
"_source": {
"companyId": 3179,
"firstName": "Name2",
"lastName": "Zwezwe",
},
"sort": [
"Zwezwe"
]
},
{
"_index": "employee-index",
"_type": "_doc",
"_id": "639639",
"_score": null,
"_source": {
"companyId": 3179,
"firstName": "Name1",
"lastName": "Zwane",
},
"sort": [
"Zwane"
]
}
I am including my mappings as I think it might be the reason for it, but can't figure it out.
PUT employee-index-sorting
{
"settings": {
"index": {
"analysis": {
"filter": {},
"analyzer": {
"keyword_analyzer": {
"filter": [
"lowercase",
"asciifolding",
"trim"
],
"char_filter": [],
"type": "custom",
"tokenizer": "keyword"
},
"edge_ngram_analyzer": {
"filter": [
"lowercase"
],
"tokenizer": "edge_ngram_tokenizer"
},
"edge_ngram_search_analyzer": {
"tokenizer": "lowercase"
}
},
"tokenizer": {
"edge_ngram_tokenizer": {
"type": "edge_ngram",
"min_gram": 2,
"max_gram": 5,
"token_chars": [
"letter"
]
}
}
}
}
},
"mappings": {
"_doc": {
"properties": {
"employeeId": {
"type": "keyword"
},
"companyGroupId": {
"type": "keyword"
},
"companyId": {
"type": "keyword"
},
"number": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"preferredName": {
"type": "text",
"index": false
},
"firstName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"middleName": {
"type": "text",
"index": false
},
"lastName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"fullName": {
"type": "text",
"fields": {
"keywordstring": {
"type": "text",
"analyzer": "keyword_analyzer"
},
"edgengram": {
"type": "text",
"analyzer": "edge_ngram_analyzer",
"search_analyzer": "edge_ngram_search_analyzer"
}
},
"analyzer": "standard"
},
"terminationDate": {
"type": "date"
},
"companyName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"email": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"idNumber": {
"type": "text"
},
"description": {
"type": "text",
"index": false
},
"jobNumber": {
"type": "keyword"
},
"frequencyId": {
"type": "long"
},
"frequencyCode": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"frequencyAccess": {
"type": "boolean"
}
}
}
}
}
I enabled "fielddata": true, on the text fields, no i think this is the wrong approch as i think it should just work fine with .keyword but anywayz i am still not getting the correct order. Seems like the Asc is working ok but the desc is not at all