Is there a single place where we can find the default stopwords used by ES/Lucene for every single language available? I've found bits and pieces here and there but I can't locate a single place where the implemented default lists for every language are available and up to date.
You can set up ES to use external stopword list file(s), so you can add or remove words as you see fit for your data. I suggest you start with the default list that comes with ES, and only switch to a custom list in an external file once you see something that does not work well with your data.
Here is the link where you can get a decent set of stopword lists for different languages to start with: http://www.ranks.nl/stopwords
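To make the external-file suggestion concrete, here is a minimal sketch of index settings for a `stop` token filter, built as a Python dict and printed as JSON. The names `build_stop_settings`, `my_stop`, `my_analyzer` and the path `stopwords/custom_english.txt` are made up for illustration; `stopwords` and `stopwords_path` are the filter's own parameters, and the path is resolved relative to the ES config directory, with one word per line in the file.

```python
import json


def build_stop_settings(stopwords_path=None, language="_english_"):
    """Build index settings for a custom analyzer with a stop filter.

    If stopwords_path is given, the filter reads words from that file;
    otherwise it falls back to a predefined language list such as _english_.
    All names used here (my_stop, my_analyzer) are hypothetical.
    """
    stop_filter = {"type": "stop"}
    if stopwords_path:
        # External file, relative to the ES config directory (assumed layout)
        stop_filter["stopwords_path"] = stopwords_path
    else:
        # Built-in default list for the given language
        stop_filter["stopwords"] = language
    return {
        "settings": {
            "analysis": {
                "filter": {"my_stop": stop_filter},
                "analyzer": {
                    "my_analyzer": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "my_stop"],
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON body you would send when creating the index
    print(json.dumps(build_stop_settings("stopwords/custom_english.txt"), indent=2))
```

Calling `build_stop_settings()` with no arguments gives the variant that keeps the built-in `_english_` list, which matches the advice to start from the defaults and move to a file only when needed.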
Thanks, but my question was not about custom stopword lists. It was simply about where I can find the actual default lists that ES/Lucene ships for each language it has defaults for.