Update synonyms in real time

Background:
Until now, I have always assumed that any change to the synonym file requires reindexing the whole index. If the index is small, reindexing is not that slow, but the process is tedious and causes downtime.

So my questions:

  1. Is my understanding wrong? Is there an easier way to make synonym changes take effect on my index, without reindexing?

  2. There is a future requirement to manage the rules from a database or even a UI instead of editing the file manually. Can this be done? If so, could you advise me on how to do it?

  3. I believe this has something to do with query-time versus index-time synonym expansion. A similar question was asked before here: The question. Does that answer still apply to current ES versions 5, 6 or even 7? The Definitive Guide dates back to version 2, so I am a bit afraid it is already outdated.

Thank you.

Hi,

Good questions. In general, if you use synonyms in your analysis chain at index time, Elasticsearch writes whatever tokens the synonym filter adds or changes into the index, so changing the synonyms means you need to reindex. However, most use cases don't require index-time synonym expansion; most of the time it's enough to apply synonyms at query time. This does not require reindexing, because no document is changed.
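
To illustrate the query-time approach, here is a minimal sketch of an index-creation body in which the synonym filter appears only in the search analyzer. The names `my_synonyms`, `my_index_analyzer`, `my_search_analyzer`, the field `title` and the path `analysis/synonyms.txt` are made up for the example, and the typeless mapping layout assumes a 7.x cluster.

```python
import json


def build_index_body(synonyms_path="analysis/synonyms.txt"):
    """Build an index-creation body that applies synonyms at search time only.

    All analyzer, filter and field names are hypothetical placeholders.
    The synonyms file path is relative to the Elasticsearch config directory.
    """
    return {
        "settings": {
            "analysis": {
                "filter": {
                    "my_synonyms": {
                        "type": "synonym_graph",
                        "synonyms_path": synonyms_path,
                    }
                },
                "analyzer": {
                    # Used when documents are indexed: no synonym filter,
                    # so synonym changes never affect stored tokens.
                    "my_index_analyzer": {
                        "tokenizer": "standard",
                        "filter": ["lowercase"],
                    },
                    # Used when queries are analyzed: synonyms applied here.
                    "my_search_analyzer": {
                        "tokenizer": "standard",
                        "filter": ["lowercase", "my_synonyms"],
                    },
                },
            }
        },
        "mappings": {
            "properties": {
                "title": {
                    "type": "text",
                    "analyzer": "my_index_analyzer",
                    "search_analyzer": "my_search_analyzer",
                }
            }
        },
    }


if __name__ == "__main__":
    # Print the JSON that would be sent as the body of PUT /<index-name>.
    print(json.dumps(build_index_body(), indent=2))
```

Because the indexed tokens never contain synonyms in this setup, editing the synonyms file only changes how queries are analyzed.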
To change the synonyms you still need to update the dictionary files on all nodes, though (loading from external sources or changing them from a UI is currently not possible, but surely on the roadmap). Since version 7.3 there is an API to reload search analyzers, which you can use to pick up the synonyms after changing them on disk. Before version 7.3, reloading the analyzer required closing and reopening the index.
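
As a rough sketch of the two reload paths, the helper below returns the HTTP requests you would send after updating the file on every node. The function names and the index name `my_index` are made up for the example. As far as I know, the reload API only affects synonym filters that are marked `"updateable": true` and used in a search analyzer, so the sketch includes such a filter definition as well.

```python
def updateable_synonym_filter(synonyms_path="analysis/synonyms.txt"):
    """Filter definition intended for use with the reload API (7.3+).

    Assumption: the filter must be flagged as updateable and may then
    only be used in search analyzers, not index-time analyzers.
    """
    return {
        "type": "synonym_graph",
        "synonyms_path": synonyms_path,
        "updateable": True,
    }


def reload_requests(index, version):
    """Return the (method, path) requests needed to pick up changed synonyms.

    `version` is a (major, minor) tuple of the cluster version.
    """
    if version >= (7, 3):
        # Reload search analyzers in place; the index stays open.
        return [("POST", f"/{index}/_reload_search_analyzers")]
    # Older versions: close and reopen the index to reload the analyzers.
    # The index is unavailable between the two calls.
    return [("POST", f"/{index}/_close"), ("POST", f"/{index}/_open")]


if __name__ == "__main__":
    for method, path in reload_requests("my_index", (7, 3)):
        print(method, path)
```

Note that the close/reopen path makes the index briefly unavailable, which the reload API avoids.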
Yet another alternative is query expansion on the client side: your client logic intercepts the user query before sending it to Elasticsearch and adds alternative search terms, e.g. as OR clauses.
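
A minimal sketch of such client-side expansion, assuming single-word synonyms held in an in-memory dictionary (which could just as well be loaded from a database or edited through a UI). The synonym table, the field name `title` and the function name are made up for the example; multi-word synonyms would need more careful handling.

```python
# Hypothetical synonym table; in practice this could come from a DB.
SYNONYMS = {
    "laptop": ["notebook"],
    "tv": ["television"],
}


def expand_query(user_query, field="title", synonyms=SYNONYMS):
    """Turn a user query into a bool query with synonyms as OR clauses.

    Every term of the user query must match, but each term may match
    either as typed or through any of its synonyms.
    """
    must = []
    for term in user_query.lower().split():
        alternatives = [term] + synonyms.get(term, [])
        must.append(
            {
                "bool": {
                    # "should" clauses act as OR between the alternatives.
                    "should": [{"match": {field: alt}} for alt in alternatives],
                    "minimum_should_match": 1,
                }
            }
        )
    return {"query": {"bool": {"must": must}}}


if __name__ == "__main__":
    import json

    print(json.dumps(expand_query("cheap laptop"), indent=2))
```

With this approach a synonym change takes effect on the next request, since nothing on the Elasticsearch side has to be reloaded.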
Hope this helps.

Thanks for the reply.

Here is the thing: I read that query-time synonym expansion actually slows down search performance, hence my pain point:

a) Choose the convenience of updating synonyms on the fly, while perhaps sacrificing performance...

Or

b) Choose the tedium of reindexing, but get good performance...

These pros and cons still exist in ES version 7, right?

While there is a little more work to do for query expansion on the search side, under normal circumstances I think this should be negligible. This should definitely be tested with some real data and workload, but it's not a general concern that I hear a lot about, tbh.

I think I can conclude this:

If the synonyms rarely change, I can opt for index-time synonyms and reindex every 6 months or so, while for indices that require frequent synonym changes, query-time synonyms are the way to go.

Am I right?

I have marked your initial answer as the solution, thanks!

Sounds right. It's just always good to keep the two options in mind and know their trade-offs.
