Elasticsearch sorting issue

Jacques_du_Plessis · November 19, 2018, 9:33am

Can someone please explain to me why I am getting such a random result when I sort my data.
I am just doing a simple query, trying to return all the documents filtered my a companyId sorting my lastName in desc order.

//Return all the employees from a specific company ordering by lastName asc | desc

GET employee-index-sorting
{
  "query": {
    "bool": {
      "filter": {
        "term": {
          "companyId": 3179
        }
      }
    }
  },
  "sort": [
    {
      "lastName.keyword": { <-- Should this be keyword? or not_analyzed
        "order": "desc"
      }
    }
  ]
}

The result looks as follows, but noticed that last names like van der Mescht and van Breda appears before Zwane and Zwezwe. Why would that be?

{
        "_index": "employee-index",
        "_type": "_doc",
        "_id": "637467",
        "_score": null,
        "_source": {
          "companyId": 3179,
          "firstName": "Name",
          "lastName": "van der Mescht",
        },
        "sort": [
          "van der Mescht"
        ]
      },
      {
        "_index": "employee-index",
        "_type": "_doc",
        "_id": "678335",
        "_score": null,
        "_source": {
          "companyId": 3179,
          "firstName": "Name3",
          "lastName": "van Breda",
        },
        "sort": [
          "van Breda"
        ]
      },
      {
        "_index": "employee-index",
        "_type": "_doc",
        "_id": "113896",
        "_score": null,
        "_source": {
          "companyId": 3179,
          "firstName": "Name2",
          "lastName": "Zwezwe",
        },
        "sort": [
          "Zwezwe"
        ]
      },
      {
        "_index": "employee-index",
        "_type": "_doc",
        "_id": "639639",
        "_score": null,
        "_source": {
          "companyId": 3179,
          "firstName": "Name1",
          "lastName": "Zwane",
        },
        "sort": [
          "Zwane"
        ]
      }

I am including my mappings as I think it might be the reason for it, but can't figure it out.

PUT employee-index-sorting
{
  "settings": {
    "index": {
      "analysis": {
        "filter": {},
        "analyzer": {
          "keyword_analyzer": {
            "filter": [
              "lowercase",
              "asciifolding",
              "trim"
            ],
            "char_filter": [],
            "type": "custom",
            "tokenizer": "keyword"
          },
          "edge_ngram_analyzer": {
            "filter": [
              "lowercase"
            ],
            "tokenizer": "edge_ngram_tokenizer"
          },
          "edge_ngram_search_analyzer": {
            "tokenizer": "lowercase"
          }
        },
        "tokenizer": {
          "edge_ngram_tokenizer": {
            "type": "edge_ngram",
            "min_gram": 2,
            "max_gram": 5,
            "token_chars": [
              "letter"
            ]
          }
        }
      }
    }
  },
  "mappings": {
    "_doc": {
      "properties": {
        "employeeId": {
          "type": "keyword"
        },
        "companyGroupId": {
          "type": "keyword"
        },
        "companyId": {
          "type": "keyword"
        },
        "number": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "preferredName": {
          "type": "text",
          "index": false
        },
        "firstName": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "middleName": {
          "type": "text",
          "index": false
        },
        "lastName": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "fullName": {
          "type": "text",
          "fields": {
            "keywordstring": {
              "type": "text",
              "analyzer": "keyword_analyzer"
            },
            "edgengram": {
              "type": "text",
              "analyzer": "edge_ngram_analyzer",
              "search_analyzer": "edge_ngram_search_analyzer"
            }
          },
          "analyzer": "standard"
        },
        "terminationDate": {
          "type": "date"
        },
        "companyName": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "email": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "idNumber": {
          "type": "text"
        },
        "description": {
          "type": "text",
          "index": false
        },
        "jobNumber": {
          "type": "keyword"
        },
        "frequencyId": {
          "type": "long"
        },
        "frequencyCode": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "frequencyAccess": {
          "type": "boolean"
        }
      }
    }
  }
}

I enabled "fielddata": true, on the text fields, no i think this is the wrong approch as i think it should just work fine with .keyword but anywayz i am still not getting the correct order. Seems like the Asc is working ok but the desc is not at all

zz_hello · November 20, 2018, 2:47am

maybe this link will help you
https://www.elastic.co/guide/en/elasticsearch/guide/current/sorting-collations.html
the reason may be the word "van" start with a lowercase, but 'Zwezwe' is not

Jacques_du_Plessis · November 20, 2018, 6:37am

Thanks for the response, turns out you were right. for a more in-depth answer please see SO link

system · December 18, 2018, 6:37am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Sort not working as expected Elastic Search	4	181	June 26, 2024
Sorting not working properly Elasticsearch	4	1315	October 10, 2019
Elasticsearch sort returns incorrect results? Kibana	2	447	March 15, 2023
Elasticsearch query search sort return null Elasticsearch	6	1614	January 5, 2021
Sorting is not giving accurate result Elasticsearch	3	1052	July 5, 2017

Elasticsearch sorting issue

Related topics