I want a TokenFilter that combines the tokens produced by multiple other TokenFilters, so that I can create a single field that is tokenized in several different ways. For example, a filter section like:
"filter":{
"prefix_ngram": {
"min_gram": 1,
"max_gram": 15,
"type": "edgeNGram"
},
"word_splitter": {
"type": "pattern_capture",
"patterns": ["([a-z]+)([0-9.]+)","\\(([^)]+)\\)"],
"preserve_original": 1
},
"all_tokens": {
"type": "combine",
"filters": ["prefix_ngram", "word_splitter"],
"unique": true
}
}
I looked around but couldn't find an existing filter like this. Would it be easy to create?
As a workaround, I've indexed documents with the same field repeated as sub-fields under multiple different analyzers, and then queried them with a multi_match query. That works, but it would be more elegant and efficient to store all the different tokenizations in a single field.
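For reference, the workaround looks roughly like this. The analyzer names (prefix_analyzer, word_split_analyzer) and the field name are made up for illustration; each analyzer is assumed to wrap one of the filters defined above. The mapping declares one stored field with per-sub-field analyzers:

"mappings": {
    "properties": {
        "name": {
            "type": "text",
            "analyzer": "standard",
            "fields": {
                "prefixes": { "type": "text", "analyzer": "prefix_analyzer" },
                "words":    { "type": "text", "analyzer": "word_split_analyzer" }
            }
        }
    }
}

and the query then fans out across all the sub-fields:

"query": {
    "multi_match": {
        "query": "foo",
        "fields": ["name", "name.prefixes", "name.words"]
    }
}

The downside is that the source text is analyzed and stored once per sub-field, which is exactly the duplication a "combine" filter would avoid.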