Hello, I am having a problem using the "keep_types" filter with a "pattern" tokenizer. Here is an example:
{
  "tokenizer": {
    "type": "pattern",
    "pattern": "[()., _-]"
  },
  "filter": [
    "lowercase",
    "asciifolding",
    {
      "type": "keep_types",
      "types": [ "<ALPHANUM>" ]
    }
  ],
  "text": [
    "7002982065_8031949292_Bomba (Vácuo,pressão) - Suryha.pdf"
  ]
}
The result against the _analyze API is:
{
  "tokens": []
}
If I remove the keep_types filter, the text is tokenized as intended.
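For reference, here is a sketch of the same request with only the keep_types filter removed, which is the variant that tokenizes as intended. Nothing else is changed, and I have left the response out rather than reproduce it from memory. Each token in the _analyze response includes a "type" field, so this request may also be a way to check which types the pattern tokenizer actually emits:

```json
{
  "tokenizer": {
    "type": "pattern",
    "pattern": "[()., _-]"
  },
  "filter": [
    "lowercase",
    "asciifolding"
  ],
  "text": [
    "7002982065_8031949292_Bomba (Vácuo,pressão) - Suryha.pdf"
  ]
}
```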
I also noticed that keep_types works fine if I use the "standard" analyzer, but that analyzer doesn't tokenize the text the way I need.
I am using version 6.8, but I also tried 7.5 with the same results.
Any ideas?