Truncate keyword to specific length and store

adyjayex · August 12, 2017, 7:18pm

Hello,

I'm trying to retrieve only the first 128 characters of a text field I'm storing. The field is often extremely large, but for preview purposes I'd still like to fetch part of it without significant delay over slower connections.

The truncate token filter seemed like a match, but after testing it on a multi-field, the stored value is the same as the original value.

"truncate_keyword_analyzer": {
  "type": "custom",
  "tokenizer": "keyword",
  "filter": [
    "truncate_filter"
  ]
}

And

"content": {
  "type": "text",
  "norms": false,
  "analyzer": "standard",
  "fields": {
    "preview": {
      "type": "text",
      "store": true,
      "index": false,
      "norms": false,
      "analyzer": "truncate_keyword_analyzer"
    }
  }
}

For example, when the content field contains "All I want for Christmas is a truncated text field", I'd expect the content.preview field to contain "All I want for Christmas", but it holds the full text instead.

Is there something wrong with the current setup? Should I look into other filters or tokenizers? Or is this just not possible at the moment?

Thanks!

jasontedor · August 13, 2017, 4:18pm

We never modify the source. The analyzer is used only to produce the terms for building the index; the source document goes into _source unchanged. If you want the source modified, you have to pre-process your document, either client-side or with an ingest pipeline. Alternatively, if you use copy_to and store the copy_to field, you can obtain it from the index.
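As a rough illustration of the client-side pre-processing option, here is a minimal Python sketch that adds a truncated copy of the text to the document before it is sent for indexing. The function name `add_preview` and the field name `content_preview` are hypothetical, chosen only for this example; the 128-character default comes from the question above.

```python
def add_preview(doc, source_field="content",
                preview_field="content_preview", max_len=128):
    """Return a copy of doc with a truncated preview of source_field.

    Hypothetical helper: the names and the separate top-level preview
    field are assumptions for illustration, not part of the mapping
    shown in the question.
    """
    out = dict(doc)  # shallow copy so the caller's document is untouched
    value = doc.get(source_field)
    if isinstance(value, str):
        # Python slices by code point, so this cannot split a character.
        out[preview_field] = value[:max_len]
    return out


doc = {"content": "All I want for Christmas is a truncated text field"}
indexed_doc = add_preview(doc, max_len=24)
print(indexed_doc["content_preview"])
```

The preview field could then be mapped as a small stored or source-filtered field and requested on its own, so the large content field never has to travel over the slow connection.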

adyjayex · August 13, 2017, 11:20pm

So when I use an analyzer on a multi-field and set store: true, the stored field is the same as the _source field?

I don't want the _source modified. I assumed that using a sub-field with store set to true, and retrieving it with stored_fields, would return the terms processed by the analyzer.

I'll look into copy_to as well, thanks!

jasontedor · August 14, 2017, 10:11am

No, stored fields do the same thing: they store the original value of the field, not the result of the analysis chain. In fact, as a result of your question I looked more closely at my copy_to suggestion, and I regret to say that I was mistaken: what I suggested cannot be done. Instead, you have to use my initial suggestion (do this client-side up front, or in an ingest pipeline). Alternatively, you can do it client-side when you fetch the field, or you can use a script field.
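For the script field alternative, a search request body might look roughly like the sketch below, built here as a Python dict. This is an assumption-laden example: the helper `preview_search_body` and the field name `content_preview` are hypothetical, the script assumes `content` is a single string, and the `source` key for inline scripts applies to newer releases (older 5.x releases used `inline` instead).

```python
import json

# Painless script (assumed): read the original text from _source and
# cut it to params.max_len. substring() counts UTF-16 code units.
PREVIEW_SCRIPT = (
    "def v = params._source.content; "
    "if (v == null) { return null; } "
    "return v.length() > params.max_len ? v.substring(0, params.max_len) : v;"
)


def preview_search_body(query, max_len=128):
    """Build a search body that returns only a truncated preview per hit."""
    return {
        "_source": False,  # do not send the full document back
        "query": query,
        "script_fields": {
            "content_preview": {  # hypothetical output field name
                "script": {
                    "lang": "painless",
                    "source": PREVIEW_SCRIPT,
                    "params": {"max_len": max_len},
                }
            }
        },
    }


body = preview_search_body({"match_all": {}})
print(json.dumps(body, indent=2))
```

With this approach the server still loads the full _source for each hit in order to run the script, so it reduces what goes over the wire rather than the work done on the node.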

adyjayex · August 14, 2017, 12:14pm

Good to know! Thanks for helping me with this and also providing me with alternative suggestions!

jasontedor · August 14, 2017, 2:06pm

You are very welcome.

system · September 11, 2017, 2:07pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
