I used the whitespace tokenizer with a lowercase filter, but it turned out to take longer than the standard analyzer. However, I need to split sentences only on whitespace, and I also need lowercasing. The standard analyzer would be right for that, except that it also removes special characters.
I need to search for data containing special characters.
The standard analyzer has a stop words option, but I think that just removes terms that match the stop list.
I hope you can help me with this.
Thank you in advance!
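For reference, the custom analyzer I'm describing is roughly this (the index name my_index and analyzer name lowercase_whitespace are just placeholders):

```json
PUT /my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "lowercase_whitespace": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": ["lowercase"]
        }
      }
    }
  }
}
```

This splits only on whitespace (so special characters stay in the tokens) and lowercases the result, which is the behavior I need; the problem is just the performance.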
I use the default standard analyzer with about 1 TB per index. When I search for data from the last five minutes, it takes about a minute. But when I switched to the whitespace analyzer, it took much longer (roughly 20 minutes, I think), so I had to revert.