Synonyms in a production environment

Hi all,

I'm working with one of our customers on implementing Elasticsearch as a replacement for the current search solution on their e-commerce website. One requirement is that the engine must handle searching with synonyms. Adding synonym support to Elasticsearch isn't that exciting, but another requirement is that it must be possible to update synonyms dynamically. That's where we start to run into issues.

Synonyms can be handled in two ways: at query time or at index time. Since synonyms will be updated regularly, query-time synonyms sound like the best way to go. Our client has quite a lot of data, and reindexing all documents affected by a synonym change would probably be a performance killer.
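To make the trade-off concrete, here is a toy sketch in plain Python. It is not Elasticsearch code; the inverted index, the `expand`, `build_index` and `search` helpers, and the sample documents are all made up for illustration. It only shows why a synonym change takes effect immediately at query time but requires a rebuild at index time.

```python
# Toy model only: a whitespace-tokenized inverted index, not Elasticsearch.

def expand(tokens, synonyms):
    """Return the tokens plus any synonyms listed for them."""
    terms = set()
    for token in tokens:
        terms.add(token)
        terms.update(synonyms.get(token, ()))
    return terms

def build_index(docs, synonyms=None):
    """Map each term to the set of document ids containing it.
    If synonyms are given, they are expanded at index time."""
    index = {}
    for doc_id, text in docs.items():
        terms = expand(text.lower().split(), synonyms or {})
        for term in terms:
            index.setdefault(term, set()).add(doc_id)
    return index

def search(index, query, synonyms=None):
    """Return ids of documents matching any query term.
    If synonyms are given, they are expanded at query time."""
    terms = expand(query.lower().split(), synonyms or {})
    hits = set()
    for term in terms:
        hits |= index.get(term, set())
    return hits

docs = {1: "red sofa", 2: "blue couch"}
synonyms = {"sofa": ["couch"], "couch": ["sofa"]}

# Query-time: the index was built without synonyms and is left untouched.
plain_index = build_index(docs)
query_time_hits = search(plain_index, "sofa", synonyms)

# Index-time: an index built before the synonym existed misses document 2
# until every affected document is indexed again.
stale_hits = search(plain_index, "sofa")
rebuilt_hits = search(build_index(docs, synonyms), "sofa")

print(query_time_hits, stale_hits, rebuilt_hits)
```

In the toy model the query-time search finds both documents without touching the index, while the index-time variant only does so after the rebuild.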

Now if we go for query-time synonyms, the only thing we need to do is create a synonym filter that takes a 'synonyms_path' parameter pointing to a local synonym file on the machine where Elasticsearch is running. If we want to update this file, we need to push a new version to all machines in the cluster and restart the nodes one by one, as described in:
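For reference, a minimal sketch of what such an index definition could look like, built as a Python dict so it can be sent with any HTTP client. The `synonym` filter type, its `synonyms_path` setting (resolved relative to the Elasticsearch config directory) and the `search_analyzer` mapping option are real Elasticsearch features; the names `my_synonyms`, `my_search_analyzer`, the `title` field and the file path are placeholders, and the typeless `mappings` layout assumes a recent Elasticsearch version (older versions nest the properties under a type name).

```python
import json

def synonym_index_body(synonyms_path):
    """Build an index body that applies file-based synonyms at query time only.
    All analyzer, filter and field names here are hypothetical placeholders."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    "my_synonyms": {
                        "type": "synonym",
                        # File is read from each node's local config directory.
                        "synonyms_path": synonyms_path,
                    }
                },
                "analyzer": {
                    "my_search_analyzer": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "my_synonyms"],
                    }
                },
            }
        },
        "mappings": {
            "properties": {
                "title": {
                    "type": "text",
                    # Documents are indexed without synonyms...
                    "analyzer": "standard",
                    # ...and synonyms are expanded only when searching.
                    "search_analyzer": "my_search_analyzer",
                }
            }
        },
    }

body = synonym_index_body("analysis/synonyms.txt")
print(json.dumps(body, indent=2))
```

Because the synonym filter is attached only to the search analyzer, changing the file never requires reindexing, but every node still has to pick up the new file, which is exactly the operational problem described here.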

https://www.elastic.co/guide/en/elasticsearch/guide/current/using-stopwords.html#updating-stopwords

In a production environment, with multiple synonym changes/additions during the day, this doesn't really sound like a workable solution.

I was wondering what kind of experience you all have with implementing and maintaining dynamically updatable synonyms in Elasticsearch in production environments.

Thanks in advance :slight_smile:
