Shingles and terms aggregation not working as expected


I am having trouble getting even a simple terms aggregation to work on a shingled field. For context, I want a list of the words and phrases found in the sentences.display_text field (using shingles), along with how often each one occurs.
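
To make the goal concrete, the following is a minimal Python sketch of the desired output. It only approximates the analyzer: it splits on whitespace rather than using the standard tokenizer, the helper names `shingles` and `phrase_counts` are hypothetical, and the sample sentences are invented. The shingle sizes (2 to 5, plus unigrams) are taken from the settings below. Each phrase is counted once per sentence, on the assumption that every nested sentence is its own document and the aggregation reports a per-document count.

```python
from collections import Counter


def shingles(text, min_size=2, max_size=5, output_unigrams=True):
    """Approximate a lowercase + shingle filter chain on whitespace tokens."""
    tokens = text.lower().split()
    sizes = list(range(min_size, max_size + 1))
    if output_unigrams:
        sizes = [1] + sizes
    phrases = []
    for n in sizes:
        # Slide a window of n tokens across the sentence.
        for i in range(len(tokens) - n + 1):
            phrases.append(" ".join(tokens[i:i + n]))
    return phrases


def phrase_counts(sentences):
    """Count, for each phrase, how many sentences contain it."""
    counts = Counter()
    for sentence in sentences:
        # set() so a phrase repeated within one sentence counts once.
        counts.update(set(shingles(sentence)))
    return counts


# Invented sample data, for illustration only.
sample = ["The quick brown fox", "The quick red fox"]
counts = phrase_counts(sample)
for phrase, count in sorted(counts.items()):
    print(f"{count}  {phrase}")
```

With these two sample sentences, `counts["the quick"]` is 2 and `counts["quick brown"]` is 1.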

My index settings and mapping are as follows:

{
    "settings": {
        "max_shingle_diff": 4,
        "analysis": {
            "analyzer": {
                "custom": {
                    "tokenizer": "standard",
                    "filter": ["lowercase", "shingle_filter"]
                }
            },
            "filter": {
                "shingle_filter": {
                    "type": "shingle",
                    "min_shingle_size": 2,
                    "max_shingle_size": 5,
                    "output_unigrams": "true"
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "sentences": {
                "type": "nested",
                "include_in_root": "true",
                "properties": {
                    "display_text": {
                        "type": "text",
                        "fields": {
                            "shingles": {
                                "type": "text",
                                "analyzer": "custom"
                            }
                        }
                    }
                }
            }
        }
    }
}

And here is the query I run:

GET /books/_search
{
    "size": 0,
    "query": {
        "terms": {
            "title": ["Book1"]
        }
    },
    "aggs": {
        "sentences_count": {
            "nested": {
                "path": "sentences"
            },
            "aggs": {
                "phrases": {
                    "terms": {
                        "field": "sentences.display_text.shingles",
                        "size": 100
                    }
                }
            }
        }
    }
}

However, I get an error that seems to treat sentences.display_text.shingles as a plain text field rather than as a field of shingle tokens:

"type" : "illegal_argument_exception",
"reason" : "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [sentences.display_text.shingles] in order to load field data by uninverting the inverted index. Note that this can use significant memory."

Does anyone know what I am doing wrong? Based on previous discussions here, this seemed to be the correct query structure and index mapping.
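
For reference, the alternative that the error message itself mentions, setting fielddata=true on [sentences.display_text.shingles], would go on the subfield's mapping. The following is a minimal Python sketch that builds such a mapping body as a dict and prints it as JSON. The variable names are hypothetical, the rest of the body simply mirrors the mapping above, and whether it can be applied to an existing index without reindexing is not verified here. As the error notes, fielddata can use significant memory.

```python
import json

# Subfield definition from the mapping above, plus the fielddata flag
# that the error message proposes.
shingles_field = {
    "type": "text",
    "analyzer": "custom",
    "fielddata": True,  # per the error: loads field data by uninverting the index
}

# Hypothetical mapping body; everything except "fielddata" is copied
# from the original mapping.
mapping_body = {
    "properties": {
        "sentences": {
            "type": "nested",
            "include_in_root": "true",
            "properties": {
                "display_text": {
                    "type": "text",
                    "fields": {"shingles": shingles_field},
                }
            },
        }
    }
}

print(json.dumps(mapping_body, indent=4))
```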

