Non-standard analyzer/tokenizer


(Eugene Prokopiev) #1

Hi,

What is the best way to configure a non-standard analyzer/tokenizer? I want to split not only on whitespace but also on underscores, dashes, and slashes, and to exclude numbers, so that only lowercase words of more than 3 letters, without digits, remain.


(Nik Everett) #2

If any of the tokenizers here will do, then you can just configure them. The Pattern tokenizer lets you define a regex, so it's super flexible. It might be the best fit in your case.
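
As a rough sketch of what that could look like (not a tested production config): the settings below combine a `pattern` tokenizer with the `lowercase`, `pattern_replace`, and `length` token filters. The names `word_analyzer`, `separator_tokenizer`, `drop_numeric`, and `min_four_chars` are made up for illustration, and the sketch assumes that "exclude numbers" means dropping any token that contains a digit. The `emulate_analyzer` helper is only a pure-Python approximation of the intended behavior for checking the regexes; it does not call Elasticsearch.

```python
import json
import re

# Index settings to send when creating the index.
# All custom names here are hypothetical; pick your own.
ANALYSIS_SETTINGS = {
    "settings": {
        "analysis": {
            "tokenizer": {
                "separator_tokenizer": {
                    "type": "pattern",
                    # Split on runs of whitespace, underscores, dashes, slashes.
                    "pattern": r"[\s_\-/]+",
                }
            },
            "filter": {
                "drop_numeric": {
                    # Blank out any token containing a digit (assumption about
                    # what "exclude numbers" means); the length filter below
                    # then removes the resulting empty tokens.
                    "type": "pattern_replace",
                    "pattern": r".*\d.*",
                    "replacement": "",
                },
                # "More than 3 letters" means at least 4 characters.
                "min_four_chars": {"type": "length", "min": 4},
            },
            "analyzer": {
                "word_analyzer": {
                    "type": "custom",
                    "tokenizer": "separator_tokenizer",
                    "filter": ["lowercase", "drop_numeric", "min_four_chars"],
                }
            },
        }
    }
}


def emulate_analyzer(text):
    """Approximate the analyzer chain above in plain Python."""
    tokens = [t for t in re.split(r"[\s_\-/]+", text) if t]   # tokenizer
    tokens = [t.lower() for t in tokens]                      # lowercase
    tokens = [t for t in tokens if not re.search(r"\d", t)]   # drop_numeric
    return [t for t in tokens if len(t) >= 4]                 # min_four_chars


if __name__ == "__main__":
    print(json.dumps(ANALYSIS_SETTINGS, indent=2))
    print(emulate_analyzer("Hello_World foo-bar/Search2 index 12345"))
```

Once an index is created with settings like these, the `_analyze` API with `"analyzer": "word_analyzer"` should show the actual tokens produced, which is the reliable way to confirm the behavior.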

