Create keyword string type with custom analyzer in 5.3.0


#1

I have a string I'd like to index as a keyword type, but with a custom comma analyzer. For example:

"San Francisco, Boston, New York"

should be both indexed and aggregatable, so that I can split it into buckets. Before 5.0.0 the following worked:
Index settings:

{
    'settings': {
        'analysis': {
            'tokenizer': {
                'comma': {
                    'type': 'pattern',
                    'pattern': ','
                }
            },
            'analyzer': {
                'comma': {
                    'type': 'custom',
                    'tokenizer': 'comma'
                }
            }
        }
    }
}

with the following mapping:

{
    'city': {
        'type': 'string',
        'analyzer': 'comma'
    }
}

Now in 5.3.0 and above, analyzer is no longer a valid property for the keyword type, and my understanding is that I want a keyword here. How do I specify an aggregatable, indexed, searchable text type with a custom analyzer?


(Simon Willnauer) #2

If you specify an analyzer that produces more than one token, you have to use the text type, since the field is no longer a keyword. If you want to use a normalizer, you can use the keyword type, but then you need to split the tokens in the client and send a list ("San Francisco, Boston, New York" -> ["San Francisco", "Boston", "New York"]), since keywords don't allow tokenization. If you use text as the type, you can't use doc values anymore, and for aggregations all terms will be loaded into memory.
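
A minimal sketch of the two options described in this reply, written in the same Python-dict style as the question. The helper name split_cities and the variable names are hypothetical, chosen only for illustration. The second mapping assumes the 'comma' analyzer from the question is defined in the index settings, and that fielddata is enabled on the text field so it can be aggregated on (this is what loads the terms into memory).

```python
def split_cities(raw):
    """Split a comma-separated string into a list of trimmed, non-empty values."""
    return [part.strip() for part in raw.split(',') if part.strip()]


# Option 1: keyword field. No analyzer is involved; the client sends a list,
# and each list element is indexed as its own keyword value.
keyword_mapping = {
    'city': {
        'type': 'keyword'
    }
}
document = {'city': split_cities('San Francisco, Boston, New York')}

# Option 2: text field with the custom analyzer from the question.
# Assumption: fielddata is switched on so the field can be used in
# aggregations, at the cost of holding the terms in heap memory.
text_mapping = {
    'city': {
        'type': 'text',
        'analyzer': 'comma',
        'fielddata': True
    }
}
```

With option 1 the document body carries the list directly, so a terms aggregation on city can use doc values. With option 2 the string is sent unchanged and split at index time. Note that a bare ',' pattern probably leaves the space after each comma inside the token, so the pattern may need to absorb surrounding whitespace.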

