Aggregate on a keyword

(tomer zaks) #1


I am getting an error:

Discover: Fielddata is disabled on text fields by default. Set fielddata=true on [transactionId] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory.

When I checked the fielddata docs, I saw that it is recommended to map transactionId as a keyword field instead.

So I did:
PUT _template/template_1
{
  "template": "filebeat-*",
  "settings": {
    "number_of_shards": 1
  },
  "mappings": {
    "type1": {
      "properties": {
        "transactionId": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword"
            }
          }
        },
        "created_at": {
          "type": "date",
          "format": "EEE MMM dd HH:mm:ss Z YYYY"
        }
      }
    }
  }
}

Then in the mapping I get:

But I still get the same error. What should I do?

(tri-man) #2

transactionId.keyword is a sub-field that ES creates so that you can aggregate on it efficiently in Kibana.
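
For example, a terms aggregation can target the sub-field directly. This is only a minimal sketch, assuming the filebeat-* index pattern from the template in the question; the aggregation name by_transaction is made up for illustration.

```
# by_transaction is a hypothetical aggregation name
GET filebeat-*/_search
{
  "size": 0,
  "aggs": {
    "by_transaction": {
      "terms": {
        "field": "transactionId.keyword"
      }
    }
  }
}
```

In Kibana, the equivalent would be picking transactionId.keyword rather than transactionId as the field of the Terms aggregation.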

To turn on "fielddata" as the error suggests, you need to change the mapping as follows. Feel free to swap the "standard" analyzer for another one that fits your needs, but don't use the "keyword" analyzer here:

"transactionId": { "type": "text", "fielddata": true, "analyzer": "standard" }

instead of what you have
"transactionId": { "type": "text", "fields": { "keyword": { "type": "keyword" } } }

You use the "keyword" type when you want the stored value to be treated as one token, not multiple tokens. As one token, it works best for aggregations in Kibana. But there are cases where you want multiple tokens, which is good for search but expensive for aggregation. That's why ES creates the "transactionId.keyword" sub-field, so you can use it in Kibana or search "transactionId.keyword" directly for an exact match. The "transactionId" field itself uses the "standard" analyzer by default, unless you explicitly set a different analyzer on it.
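
The one-token versus multiple-token difference can be seen with the _analyze API. This is a minimal sketch, assuming a hypothetical sample value "TX 1001 retry". The "standard" analyzer should split it into the lowercased tokens tx, 1001 and retry, while the "keyword" analyzer returns the whole string as a single token, which is also how a keyword field indexes a value.

```
# "TX 1001 retry" is a made-up sample value, not from the original data
POST _analyze
{
  "analyzer": "standard",
  "text": "TX 1001 retry"
}

POST _analyze
{
  "analyzer": "keyword",
  "text": "TX 1001 retry"
}
```

An aggregation over the first form would bucket on each of the three tokens separately. An aggregation over the second form buckets on the full transaction ID.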

And lastly, when you use the "keyword" type, you don't need to turn on "fielddata", because the value is treated as one token. You only turn it on when you want to aggregate on a text field that an analyzer splits into multiple tokens.
