Is sorting on Text/String field no longer available in 5.x?

I understand that this is a new change, but I can't see what's wrong with this mapping. This is the mapping of the field as returned by Elasticsearch through the _mapping call:

"key": {
    "type": "text",
    "fields": {
        "keyword": {
             "type": "keyword",
             "ignore_above": 256
        }
    }
}

When I sort using the "key" field, I get this exception. I have also tried adding fieldData=true, but that also didn't work.

Caused by: RemoteTransportException[[_6qwpaI][127.0.0.1:9300][indices:data/read/search[phase/query]]]; nested: IllegalArgumentException[Fielddata is disabled on text fields by default. Set fielddata=true on [key] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory.];
Caused by: java.lang.IllegalArgumentException: Fielddata is disabled on text fields by default. Set fielddata=true on [key] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory.
	at org.elasticsearch.index.mapper.TextFieldMapper$TextFieldType.fielddataBuilder(TextFieldMapper.java:335)
	at org.elasticsearch.index.fielddata.IndexFieldDataService.getForField(IndexFieldDataService.java:111)
	at org.elasticsearch.index.query.QueryShardContext.getForField(QueryShardContext.java:167)
	at org.elasticsearch.search.sort.FieldSortBuilder.build(FieldSortBuilder.java:281)
	at org.elasticsearch.search.sort.SortBuilder.buildSort(SortBuilder.java:151)
	at org.elasticsearch.search.SearchService.parseSource(SearchService.java:678)
	at org.elasticsearch.search.SearchService.createContext(SearchService.java:536)
	at org.elasticsearch.search.SearchService.createAndPutContext(SearchService.java:502)
	at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:243)
	at org.elasticsearch.action.search.SearchTransportService.lambda$registerRequestHandler$6(SearchTransportService.java:276)
	at org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:33)
	at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:69)
	at org.elasticsearch.transport.TransportService$6.doRun(TransportService.java:550)
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:527)
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

At first I thought this might be related to the use of a sub-field (e.g. key.keyword), but since I'm not doing that, I don't see any reason why this is not working.

I'm pretty new to Elasticsearch 5.x and the documentation seems to contradict itself, so I hope someone can point me in the right direction. This is what I am referring to:

https://www.elastic.co/guide/en/elasticsearch/reference/current/fielddata.html

I'm essentially trying to achieve the same as this original mapping that was working until 5.x:

"mapping": {
    "type": "string",
    "fields": {
        "raw": {
            "type": "string",
            "ignore_above": 256
        },
        "english": {
            "type": "string",
            "analyzer": "english"
        }
    }
}

That is what you need to do, though: instead of sorting on key, sort on key.keyword, which uses the non-analyzed version of the field (the one that has doc_values) for sorting.
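As a rough sketch, assuming an index named my_index (a placeholder, substitute your own), the search request would look something like this:

 # my_index is a hypothetical index name
 GET /my_index/_search
 {
   "sort": [
     { "key.keyword": { "order": "asc" } }
   ]
 }

Sorting on the keyword sub-field avoids loading fielddata for the analyzed text field, which is what the exception above is complaining about.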

I switched to sorting by key.keyword and that sorted out the issue. Is this field now autogenerated? It was created even though I never specified a field named "keyword". The fields I did specify, on the other hand, don't get added to the mapping at all. This is my mapping:

{
    "strings": {
        "match_mapping_type": "text",
        "mapping": {
            "type": "text",
            "fields": {
                "raw": {
                    "type": "keyword",
                    "ignore_above": 256
                },
                "english": {
                    "type": "text",
                    "analyzer": "english"
                }
            }
        }
    }
}

If you index a string with no mapping, ES in 5.0+ now automatically creates a text version and a keyword version (under .keyword) of the field.
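A quick way to see this for yourself, sketched with a hypothetical throwaway index named scratch that has no mapping or template applied:

 # scratch is a hypothetical index name; no mapping is defined beforehand
 POST /scratch/doc/1
 {"key": "foo"}

 GET /scratch/_mapping?pretty

The mapping returned for key should have the same shape as the one quoted at the top of this thread: a text field with a keyword sub-field.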

But I do have a mapping defined in my default template, and it is not being honored. I guess that's another breaking change between 2.x and 5.x? This is a major issue.

This is because you have a typo in your dynamic mapping configuration: you are trying to match fields of type "text", but it should be "string" instead. For example, this works:

 PUT /test?pretty
 {
   "mappings": {
     "doc": {
       "dynamic_templates": [
         {
           "strings": {
             "match_mapping_type": "string",
             "mapping": {
               "type": "text",
               "fields": {
                 "raw": {
                   "type": "keyword",
                   "ignore_above": 42
                 },
                 "english": {
                   "type": "text",
                   "analyzer": "english"
                 }
               }
             }
           }
         }
       ]
     }
   }
 }

 POST /test/doc/1
 {"body": "foo"}

 GET /test/doc/_mapping?pretty

I thought there was no longer a "string" type in 5.x? That's why I changed it to "text". So to make it work, I need to continue using "string"? That doesn't sound right.

There's a disconnect here: the type used for match_mapping_type is the data type detected for the field in the document, not necessarily the ES field type. For instance, match_mapping_type only supports "long", not "integer", because it matches against the detected data type rather than an ES type. So even though ES itself uses "text" and "keyword", the data type is still "string".
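To illustrate with the "long" case, here is a minimal sketch of a dynamic template that matches on the detected data type and maps it to a different ES type. The index name test_numbers and the template name longs_as_integers are made up for the example:

 # test_numbers and longs_as_integers are hypothetical names
 PUT /test_numbers?pretty
 {
   "mappings": {
     "doc": {
       "dynamic_templates": [
         {
           "longs_as_integers": {
             "match_mapping_type": "long",
             "mapping": {
               "type": "integer"
             }
           }
         }
       ]
     }
   }
 }

Here match_mapping_type names what was detected in the document ("long"), while mapping.type names the ES field type to create ("integer"). The "string" to "text" case works the same way.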

I agree this is confusing. There was a PR for 5.0+ that adds deprecation logging for this: "Elasticsearch should reject dynamic templates with unknown `match_mapping_type`" by jpountz (Pull Request #17285, elastic/elasticsearch on GitHub). I also opened "Throw an exception on unrecognized "match_mapping_type"" by dakrone (Pull Request #22090, elastic/elasticsearch on GitHub), so 6.0 will throw an exception when an unrecognized type is used.
