We're currently on 1.6 with plans to move to 1.7 soon.
We want to incorporate a synonym token filter, but I was hoping to combine sources: for example, using the WordNet synonym file as an English baseline and then providing a separate file of synonyms that are unique to our business.
I have not been able to find any documentation for this sort of configuration. Is the above possible?
The synonym token filter takes only a single file in the synonyms_path parameter. I guess you could create an analyzer with multiple synonym token filters, each one pointing to a different path.
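For anyone reading along, here is a rough sketch of what an analyzer with several synonym token filters might look like, written as a Python dict that serializes to the index settings JSON. The file paths, the filter names (synonyms_0, synonyms_1) and the analyzer name (synonym_analyzer) are made up for illustration, and the choice of the standard tokenizer plus a lowercase filter is an assumption, not something from this thread.

```python
import json


def build_synonym_settings(sources):
    """Build index settings with one synonym token filter per source file.

    `sources` is a list of (path, format) pairs. `format` may be None for
    the default Solr-style file, or "wordnet" for a WordNet prolog file.
    All names generated here are hypothetical examples.
    """
    filters = {}
    filter_names = []
    for i, (path, fmt) in enumerate(sources):
        name = "synonyms_%d" % i
        config = {"type": "synonym", "synonyms_path": path}
        if fmt is not None:
            config["format"] = fmt
        filters[name] = config
        filter_names.append(name)

    analyzer = {
        "type": "custom",
        "tokenizer": "standard",  # assumed tokenizer
        # Filters run in order: lowercase first, then each synonym filter.
        "filter": ["lowercase"] + filter_names,
    }
    return {
        "settings": {
            "analysis": {
                "filter": filters,
                "analyzer": {"synonym_analyzer": analyzer},
            }
        }
    }


if __name__ == "__main__":
    settings = build_synonym_settings([
        ("analysis/wn_s.pl", "wordnet"),            # hypothetical baseline file
        ("analysis/business_synonyms.txt", None),   # hypothetical custom file
    ])
    print(json.dumps(settings, indent=2))
```

The printed JSON would be the body you send when creating the index. Note that the filters are chained, so the second synonym filter also sees whatever tokens the first one emitted.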
Thanks Colin. I'm still a bit new to Elasticsearch. Can I associate multiple token filters with my mapping? That seems like it might cause a lot of different terms to be indexed.
Or maybe I can just export the WordNet synonyms into a text file and then manually edit or add the ones that I need?
Right, so there are problems with using large synonym dictionaries (even if you just use one synonym filter). Have a look at the Synonyms chapter of Elasticsearch: The Definitive Guide [2.x] for more information on the advantages and disadvantages, including the sub-pages in the right-hand menu.
Yes, usually this is the best approach: see which synonyms you actually need and add them manually, to keep the number of synonyms fairly low.
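If you go the single-file route, a small script can help keep the curated list tidy. The sketch below assumes both inputs are already in the Solr-style format (one rule per line, comma-separated terms, lines starting with # treated as comments); converting a WordNet prolog file into that format is not covered here. The function name merge_synonym_rules and the sample rules are invented for the example.

```python
def merge_synonym_rules(*sources):
    """Merge Solr-style synonym rule texts into one de-duplicated list.

    Blank lines and '#' comment lines are dropped. Duplicates are detected
    case-insensitively after collapsing whitespace; the first spelling wins
    and the original order is preserved.
    """
    seen = set()
    merged = []
    for text in sources:
        for line in text.splitlines():
            rule = line.strip()
            if not rule or rule.startswith("#"):
                continue
            key = " ".join(rule.lower().split())
            if key in seen:
                continue
            seen.add(key)
            merged.append(rule)
    return merged


if __name__ == "__main__":
    # Hypothetical hand-picked baseline rules and business-specific rules.
    baseline = "# picked from the baseline list\ncar, automobile\nfast, quick\n"
    business = "car, automobile\nsku, stock keeping unit\n"
    print("\n".join(merge_synonym_rules(baseline, business)))
```

The merged lines could then be written to the one file that synonyms_path points at.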