I want to filter stop words out of a terms aggregation, so I defined an analyzer with a custom stop word list (around 1000 words) and applied it to my index.
When I run a terms aggregation against the index, the stop words dominate the result.
The analyzer itself seems to work as it should, but the stop words still show up in the aggregation.
Or should I really pass the complete 1000-word stop list in the exclude parameter of the terms aggregation on every query? This can't be the way to go...
The aggregation should only work on the data present in the field.
Was the field indexed with that analyzer? Use the get mapping API to determine
what the actual mapping is. When using the analyze API, use the field parameter
and not the analyzer parameter, so that the actual mapped analyzer is used.
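A minimal sketch of those two checks in Python. It only builds the request paths and bodies and sends nothing. The index name my_index, the field name body, and the helper names are hypothetical. Sending field and text in a JSON body assumes a reasonably recent Elasticsearch version; older releases took them as query-string parameters.

```python
import json


def mapping_request(index):
    """Return (method, path) for the get mapping API of an index."""
    return ("GET", f"/{index}/_mapping")


def analyze_request(index, field, text):
    """Return (method, path, body) for the analyze API.

    Passing "field" instead of "analyzer" makes Elasticsearch pick whatever
    analyzer the mapping really assigns to that field.
    """
    body = {"field": field, "text": text}
    return ("GET", f"/{index}/_analyze", body)


if __name__ == "__main__":
    # Hypothetical index and field names, for illustration only.
    print(*mapping_request("my_index"))
    method, path, body = analyze_request("my_index", "body", "the quick brown fox")
    print(method, path)
    print(json.dumps(body))
```

If the tokens returned for the field still contain stop words, the field is not using the custom analyzer, whatever the index settings say.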
Ok, I think I found the error.
Thanks for pointing me in the direction of the mapping.
It was just a misplaced "}".
The mappings configuration had ended up inside the settings object...
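For anyone hitting the same problem, here is a sketch of the shape the create-index body should have, with settings and mappings as siblings at the top level. The analyzer name my_stop_analyzer, the field name body, and the three-word stop list are placeholders. The typeless mappings layout assumes Elasticsearch 7 or later; older versions nest properties under a type name. The small helper is hypothetical and only flags the misplaced-brace case.

```python
index_body = {
    "settings": {
        "analysis": {
            "analyzer": {
                # Placeholder analyzer with a tiny stand-in stop word list.
                "my_stop_analyzer": {
                    "type": "standard",
                    "stopwords": ["the", "and", "of"],
                }
            }
        }
    },
    # "mappings" sits next to "settings", not inside it.
    "mappings": {
        "properties": {
            "body": {"type": "text", "analyzer": "my_stop_analyzer"}
        }
    },
}


def mappings_misplaced(body):
    """Return True if "mappings" was nested inside "settings" by mistake."""
    return "mappings" in body.get("settings", {})


if __name__ == "__main__":
    print(mappings_misplaced(index_body))
```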
Which is why I always tell people to use the API to find out what the
mapping is and NOT what they think the mapping is.
When you used the analyze API, you specified the analyzer to use. Instead,
specify the field you want to use so that the exact analyzer defined in
the mapping is used. Look at the last example: