Should sort missing values be normalized?

Hi all,
I'm Fabio. Nice to meet you guys.

With Elasticsearch it is possible to provide a missing value in a sort clause. The missing value can even be a string, which is very nice since Lucene has no support for that.

My question is:

What happens when a normalizer is defined on the field? Should the missing value be normalized too?

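As far as I can tell, Elasticsearch currently doesn't do it: in the response below, the missing value comes back as "Daniel", while the indexed values come back lowercased. Is this the expected behavior?
Many thanks in advance.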

Below you can find the REST calls I made to reproduce the case.

(1) The Mapping 
Executed Elasticsearch HTTP PUT request to path '/indexname'. 
Request body: <
{
  "settings": {
    "analysis": {
      "normalizer": {
        "DefaultAnalysisDefinitions_lowercase": {
          "type": "custom",
          "filter": [
            "lowercase"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "byType_normalizedString": {
        "type": "keyword",
        "index": true,
        "norms": false,
        "doc_values": true,
        "store": false,
        "normalizer": "DefaultAnalysisDefinitions_lowercase"
      }
    },
    "dynamic": "strict"
  }
}
>

(2) The Indexing 
Executed Elasticsearch HTTP POST request to path '/_bulk'
Request body: <
{
  "index": {
    "_index": "indexname",
    "_id": "2"
  }
}
{
  "byType_normalizedString": "george"
}
{
  "index": {
    "_index": "indexname",
    "_id": "1"
  }
}
{
  "byType_normalizedString": "Cecilia"
}
{
  "index": {
    "_index": "indexname",
    "_id": "3"
  }
}
{
  "byType_normalizedString": "Stefany"
}
{
  "index": {
    "_index": "indexname",
    "_id": "empty"
  }
}
{}
>

(3) Force refresh

(4) The Querying
Executed Elasticsearch HTTP POST request to path '/indexname/_search' with query parameters {size=10000, track_total_hits=true}
Request body: <
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "byType_normalizedString": {
        "order": "asc",
        "missing": "Daniel"
      }
    }
  ]
}
>. 
Response body: <
{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 4,
      "relation": "eq"
    },
    "hits": [
      {
        "_index": "indexname",
        "_type": "_doc",
        "_id": "empty",
        "_source": {},
        "sort": [
          "Daniel"
        ]
      },
      {
        "_index": "indexname",
        "_type": "_doc",
        "_id": "1",
        "_source": {
          "byType_normalizedString": "Cecilia"
        },
        "sort": [
          "cecilia"
        ]
      },
      {
        "_index": "indexname",
        "_type": "_doc",
        "_id": "2",
        "_source": {
          "byType_normalizedString": "george"
        },
        "sort": [
          "george"
        ]
      },
      {
        "_index": "indexname",
        "_type": "_doc",
        "_id": "3",
        "_source": {
          "byType_normalizedString": "Stefany"
        },
        "sort": [
          "stefany"
        ]
      }
    ]
  }
}
>
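
To make the ordering in the response concrete, here is a small Python sketch. It is not Elasticsearch code. It assumes that keyword sort keys are compared by their UTF-8 bytes, and it uses str.lower() as a rough stand-in for the lowercase normalizer; the helper names byte_order and normalize are made up for this illustration.

```python
# Sort keys of the three indexed documents, as returned in the response
# (already lowercased by the normalizer at index time).
indexed_keys = ["cecilia", "george", "stefany"]

# The missing value exactly as sent in the sort clause.
missing_value = "Daniel"


def byte_order(values):
    # Assumption: keyword sort keys compare by their UTF-8 bytes,
    # so every uppercase ASCII letter sorts before every lowercase one.
    return sorted(values, key=lambda v: v.encode("utf-8"))


def normalize(value):
    # Hypothetical stand-in for the lowercase normalizer.
    return value.lower()


# Missing value used verbatim: matches the order seen in the response.
raw_order = byte_order(indexed_keys + [missing_value])

# Missing value normalized first: the order one might expect instead.
normalized_order = byte_order(indexed_keys + [normalize(missing_value)])

print(raw_order)
print(normalized_order)
```

Under these assumptions the first list starts with "Daniel", as in the response above, while the second places "daniel" between "cecilia" and "george". Lowercasing the missing value on the client before sending the request would be one possible workaround, though I have not checked it beyond this sketch.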
