I am using Elasticsearch 5.1 to index web logs. Because we use underscores to separate the parameters in our URLs, we need the Elasticsearch tokenizer to split on underscores too, but the default tokenizer does not split on underscores.
I still need the rest of the default tokenizer's functionality and do not want to replace the default Elasticsearch tokenizer. Can anyone tell me how to add underscore support to the default Elasticsearch tokenizer?
Unfortunately, the standard tokenizer is rather complex, so it would be difficult to replicate with the pattern tokenizer plus your modifications.
But you should know your data better than the default algorithms do. You can create a pattern tokenizer that matches your data. For it to become the default, you would need to define a default analyzer that uses that tokenizer.
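As a rough sketch of that suggestion, the Python snippet below builds an index settings body that defines a pattern tokenizer and an analyzer named `default` that uses it. The tokenizer name `underscore_pattern`, the pattern `[\W_]+`, and the `lowercase` filter are my own assumptions for illustration, not something taken from your setup. The `simulate_tokens` helper only approximates the split with Python's `re` module; Elasticsearch uses Java regular expressions, so non-ASCII input may behave differently.

```python
import json
import re

# Assumed pattern: split on any run of non-word characters or underscores.
TOKEN_SPLIT_PATTERN = r"[\W_]+"


def build_index_settings():
    """Return a settings body to send when creating the index."""
    return {
        "settings": {
            "analysis": {
                "tokenizer": {
                    # Hypothetical name for the custom pattern tokenizer.
                    "underscore_pattern": {
                        "type": "pattern",
                        "pattern": TOKEN_SPLIT_PATTERN,
                    }
                },
                "analyzer": {
                    # An analyzer named "default" becomes the index default.
                    "default": {
                        "type": "custom",
                        "tokenizer": "underscore_pattern",
                        "filter": ["lowercase"],
                    }
                },
            }
        }
    }


def simulate_tokens(text):
    """Approximate the analyzer locally: split on the pattern, then lowercase."""
    return [token.lower() for token in re.split(TOKEN_SPLIT_PATTERN, text) if token]


if __name__ == "__main__":
    print(json.dumps(build_index_settings(), indent=2))
    print(simulate_tokens("user_id=42&page_no=3"))
```

The printed JSON can be used as the request body when creating the index. It is worth running a few of your real URLs through the `_analyze` API afterwards, since this pattern will not reproduce everything the standard tokenizer does.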