Combine TokenFilter?

jonm · September 17, 2013, 11:59pm

I want a TokenFilter that combines the tokens produced by multiple other TokenFilters, so that I can create a single field that is tokenized in several different ways, e.g. with a filter section like:

"filter":{
    "prefix_ngram": {
        "min_gram": 1,
        "max_gram": 15,
        "type": "edgeNGram"
    },
    "word_splitter": {
        "type": "pattern_capture",
        "patterns": ["([a-z]+)([0-9.]+)","\\(([^)]+)\\)"],
        "preserve_original": true
    },
    "all_tokens": {
        "type": "combine",
        "filters": ["prefix_ngram", "word_splitter"],
        "unique": true  
    }
}
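To make the intended behavior concrete, here is a rough Python sketch of the semantics I have in mind for the hypothetical "combine" filter. It is not Elasticsearch or Lucene code: `edge_ngrams`, `pattern_capture` and `combine` are made-up helpers that only approximate the real filters on a single token, and the de-duplication order is an assumption.

```python
import re

# Patterns copied from the "word_splitter" filter above.
PATTERNS = [r"([a-z]+)([0-9.]+)", r"\(([^)]+)\)"]


def edge_ngrams(token, min_gram=1, max_gram=15):
    """Approximate an edge n-gram filter: prefixes of the token."""
    longest = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, longest + 1)]


def pattern_capture(token, patterns, preserve_original=True):
    """Approximate pattern_capture: emit every non-empty capture group."""
    out = [token] if preserve_original else []
    for pattern in patterns:
        for match in re.finditer(pattern, token):
            out.extend(group for group in match.groups() if group)
    return out


def combine(token, filters, unique=True):
    """Hypothetical 'combine': concatenate the output of each filter."""
    merged = []
    for token_filter in filters:
        merged.extend(token_filter(token))
    if unique:
        # dict.fromkeys drops duplicates while keeping first-seen order.
        merged = list(dict.fromkeys(merged))
    return merged


tokens = combine(
    "abc123",
    [edge_ngrams, lambda t: pattern_capture(t, PATTERNS)],
)
print(tokens)
```

For the input token "abc123" this yields the six prefixes plus the extra "123" from the pattern split, all of which would end up indexed in the one field.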

I looked but couldn't find anything like this. Is it easy to create?

As a workaround, I've created a document structure that has the same field repeated with multiple different analyzers, and then used a multi_match query, but it would be more elegant / efficient to store all the different tokenizations in a single field.
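Roughly, the workaround looks like the request bodies built below, written as Python dicts for brevity. This is only a sketch: the field name `name`, the sub-field names, and the analyzer names `prefix_analyzer` and `split_analyzer` are placeholders, and the mapping uses the multi-field (`fields`) syntax of recent Elasticsearch versions, which may differ from what your version expects.

```python
import json

# Placeholder mapping: one source field indexed twice, each copy with
# its own analyzer. The analyzers are assumed to be defined in the
# index settings and to wrap the two filters shown earlier.
mapping = {
    "properties": {
        "name": {
            "type": "text",
            "fields": {
                "prefix": {"type": "text", "analyzer": "prefix_analyzer"},
                "split": {"type": "text", "analyzer": "split_analyzer"},
            },
        }
    }
}


def build_query(text):
    """Build a multi_match query that searches both analyzed copies."""
    return {
        "query": {
            "multi_match": {
                "query": text,
                "fields": ["name.prefix", "name.split"],
            }
        }
    }


print(json.dumps(mapping, indent=2))
print(json.dumps(build_query("abc123"), indent=2))
```

This works, but every extra tokenization means another sub-field and another entry in the query's field list, which is what a single combined field would avoid.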
