How should I specify the stop words in the stopwords.txt file? One word
per line, or in some other format?
Also, I don't care which language users index their data in, so putting
stop words from different languages into the same file should be no
problem. But should I save the file as plain UTF-8, or should I use
escapes like the ones in .properties files, e.g. "de art
\u00edculos"?