Hi,
What is the best way to configure a non-standard analyzer/tokenizer: split not only on whitespace but also on underscores, dashes, and slashes, and exclude numbers, so that only lowercase words with more than 3 letters and no digits remain?
If any of the tokenizers here will do, then you can just configure them. The Pattern tokenizer lets you define a regex, so it's super flexible. It might be the best fit in your case.
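As a sketch of that approach (index and analyzer names here are placeholders, not from the original post): a `pattern` tokenizer with the regex `[^A-Za-z]+` splits on any run of non-letter characters, which covers whitespace, underscores, dashes, and slashes, and also strips digits. A `lowercase` filter plus a `length` filter with `min: 4` then keeps only lowercase tokens longer than 3 letters:

```json
PUT my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_word_analyzer": {
          "type": "custom",
          "tokenizer": "my_letter_tokenizer",
          "filter": ["lowercase", "my_min_length"]
        }
      },
      "tokenizer": {
        "my_letter_tokenizer": {
          "type": "pattern",
          "pattern": "[^A-Za-z]+"
        }
      },
      "filter": {
        "my_min_length": {
          "type": "length",
          "min": 4
        }
      }
    }
  }
}
```

One caveat: because digits act as separators in this regex, a token like `abc123def` would be split into `abc` and `def` rather than dropped whole; if you need to discard such tokens entirely, you'd want a different pattern or an extra filter. You can check the output with the `_analyze` API, e.g. `GET my_index/_analyze` with your analyzer name and a sample text.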