Combine TokenFilter?


(jonm) #1

I want a TokenFilter that combines the tokens produced by multiple other TokenFilters, so that I can create a single field that is tokenized in several different ways. For example, a filter section like:

"filter":{
    "prefix_ngram": {
        "min_gram": 1,
        "max_gram": 15,
        "type": "edgeNGram"
    },
    "word_splitter": {
        "type": "pattern_capture",
        "patterns": ["([a-z]+)([0-9.]+)","\\(([^)]+)\\)"],
        "preserve_original": true
    },
    "all_tokens": {
        "type": "combine",
        "filters": ["prefix_ngram", "word_splitter"],
        "unique": true  
    }
}

I looked but couldn't find anything like this. Is it easy to create?
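
To make the behaviour I'm after concrete, here is a rough Python sketch of what the hypothetical "combine" filter would do. It is only a simulation: `edge_ngrams` and `pattern_capture` are simplified stand-ins I wrote for illustration, not the real Lucene filters, and I'm assuming "unique" means dropping duplicates across the whole token stream.

```python
import re
from typing import Callable, Iterable, List

# Patterns copied from the "word_splitter" filter above.
PATTERNS = [r"([a-z]+)([0-9.]+)", r"\(([^)]+)\)"]

TokenFilter = Callable[[str], List[str]]


def edge_ngrams(token: str, min_gram: int = 1, max_gram: int = 15) -> List[str]:
    """Simplified stand-in for edgeNGram: prefixes of the token."""
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]


def pattern_capture(token: str, patterns: Iterable[str] = PATTERNS,
                    preserve_original: bool = True) -> List[str]:
    """Simplified stand-in for pattern_capture: emit every capture group."""
    out = [token] if preserve_original else []
    for pattern in patterns:
        for match in re.finditer(pattern, token):
            out.extend(group for group in match.groups() if group)
    return out


def combine(tokens: Iterable[str], filters: Iterable[TokenFilter],
            unique: bool = True) -> List[str]:
    """Hypothetical 'combine' filter: run each input token through every
    sub-filter and merge the results into one stream."""
    filters = list(filters)
    seen = set()
    out: List[str] = []
    for token in tokens:
        for token_filter in filters:
            for produced in token_filter(token):
                # Assumption: 'unique' removes duplicates across the stream.
                if unique and produced in seen:
                    continue
                seen.add(produced)
                out.append(produced)
    return out


if __name__ == "__main__":
    print(combine(["abc123"], [edge_ngrams, pattern_capture]))
```

So for the input token "abc123" the single field would hold both the prefixes and the split pieces ("abc", "123"), with the overlapping tokens stored only once.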

As a workaround, I index each document with the same field repeated under several different analyzers and then search with a multi_match query, but it would be more elegant and efficient to store all the different tokenizations in a single field.
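
For reference, the workaround looks roughly like the sketch below, written as Python dicts that serialize to the JSON request bodies. The field name "title" and the analyzer names are made up for illustration, and I'm assuming multi-fields with one sub-field per analyzer; on older versions the field type would be "string" rather than "text".

```python
import json
from typing import Dict, List


def build_mapping(field: str, analyzers: List[str]) -> Dict:
    """Mapping with one sub-field per analyzer (names are hypothetical)."""
    return {
        "properties": {
            field: {
                "type": "text",
                "fields": {
                    name: {"type": "text", "analyzer": name}
                    for name in analyzers
                },
            }
        }
    }


def build_multi_match(field: str, analyzers: List[str], text: str) -> Dict:
    """multi_match query that searches every analyzed sub-field at once."""
    return {
        "query": {
            "multi_match": {
                "query": text,
                "fields": [f"{field}.{name}" for name in analyzers],
            }
        }
    }


if __name__ == "__main__":
    # Hypothetical analyzers built on the two filters defined earlier.
    analyzers = ["prefix_analyzer", "splitter_analyzer"]
    print(json.dumps(build_mapping("title", analyzers), indent=2))
    print(json.dumps(build_multi_match("title", analyzers, "abc123"), indent=2))
```

This works, but every extra analyzer means another sub-field to index and another entry in the query's field list, which is what a combining filter would avoid.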

