Terms agg not giving all terms for huge text fields

I have a large text field (~3000 characters) in which I am storing the flow/path of the user inside my application.

flow_path: "ABCD12345678,ABCD098765432,PQRS56789043,EFG321987309,ABCD12345678,ABCD098765432,PQRS56789043,EFG321987309,ABCD12345678,ABCD098765432,PQRS56789043,EFG321987309,ABCD12345678,ABCD098765432,PQRS56789043,EFG321987309,ABCD12345678,ABCD098765432,PQRS56789043,EFG321987309"

flow_path: "ABCD12345678,ABCD098765432,PQRS56789043,EFG321987309"

So, user steps are recorded inside this comma-separated string.

I want to be able to both search and aggregate on this field. By default, Elasticsearch created one analyzed text field and one non-analyzed keyword multifield, and I expected that to work for my use case.

I am able to search using the analyzed flow_path field as expected, but when I aggregate on the flow_path.keyword field, it does not produce all the expected terms in Kibana.

Ex: (based on above example)
Term : Count
ABCD12345678,ABCD098765432,PQRS56789043,EFG321987309 : 1

The term from Doc1 is completely ignored. I increased the size parameter to a huge value (20000), but the issue persists.

Version: ES, Kibana 5.6.3

Please help me figure out how I can aggregate on such a big text field and get all the terms.

If you do not provide an explicit mapping for your index, string fields will be mapped like this:

          "my_field": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
If you go with this default mapping, the .keyword multifield (that you are aggregating on) ignores any value longer than 256 characters, because of the "ignore_above": 256 parameter. The value in your Doc1 is longer than that, and that's why it is missing from the terms aggregation.

To fix this, you could change your mapping to allow longer values, for example setting ignore_above to something larger than your longest expected flow_path (e.g. 4000, since you mentioned values of ~3000 characters).
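A minimal sketch of such a mapping for ES 5.x, using a hypothetical index name my-new-index and type my_type (the flow_path field name comes from the question above):

```json
PUT my-new-index
{
  "mappings": {
    "my_type": {
      "properties": {
        "flow_path": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 4000
            }
          }
        }
      }
    }
  }
}
```

With this mapping, flow_path stays searchable via the analyzed field, while flow_path.keyword keeps the full comma-separated string for aggregations.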

Note that you can not update the mappings for existing indexes. You will have to reindex your data to a new index that has the updated mappings applied.
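The reindex step can be done with the _reindex API; a sketch, assuming hypothetical index names my-old-index and my-new-index (the new index created with the updated mapping):

```json
POST _reindex
{
  "source": { "index": "my-old-index" },
  "dest":   { "index": "my-new-index" }
}
```

After reindexing completes, you can point your searches (or an alias) at the new index and drop the old one.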

Thanks for the reply. Looks like that's the issue: I checked my mapping, and it's 256 chars.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.