Genre Expansion in Elasticsearch 6.1

acorned · January 16, 2018, 12:39pm

Hello All!

I need a hierarchical synonym structure for search, as described here:
https://www.elastic.co/guide/en/elasticsearch/guide/current/synonyms-expand-or-contract.html

When I search for "pet", I want to find documents containing "cat", "dog", and so on. Is this possible in Elasticsearch 6.1? Could you give me an example?

dadoonet · January 16, 2018, 1:44pm

Yes. Use a Synonym TokenFilter in your analyzer. https://www.elastic.co/guide/en/elasticsearch/reference/6.1/analysis-synonym-tokenfilter.html
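For reference, here is a minimal sketch of what such an analyzer could look like, written as a Python dictionary that would be sent as the JSON body of an index-creation request. The names `genre_synonyms` and `genre_analyzer`, the index name in the comment, and the single rule are illustrative assumptions, not something taken from this thread.

```python
import json

# Hypothetical filter and analyzer names; the rule is only an illustration.
index_body = {
    "settings": {
        "analysis": {
            "filter": {
                "genre_synonyms": {
                    "type": "synonym",
                    # Solr-style rule: "pet" is replaced by the terms on the right.
                    "synonyms": ["pet => pet, cat, dog"],
                }
            },
            "analyzer": {
                "genre_analyzer": {
                    "tokenizer": "standard",
                    "filter": ["lowercase", "genre_synonyms"],
                }
            },
        }
    }
}

# JSON body for a request such as: PUT /my_index (index name is an assumption)
print(json.dumps(index_body, indent=2))
```

A text field would then need to reference the analyzer in its mapping; whether to apply it at index time, at search time, or both is a separate decision.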

acorned · January 22, 2018, 1:20pm

I tried Synonym TokenFilter, but I can't make it hierarchical. For example, if I write something like this in Solr:

pet => cat, dog, bird
cat => kitten, kitty
dog => chow chow, malamute
bird => parrot, hawk

and then search for "pet", I find documents containing "cat" and "dog", but not "chow chow". Of course, I could write all the lower-level synonyms on one line, but when I try to add about 70,000 low-level synonyms, Elasticsearch doesn't create the index (I waited for an hour and nothing changed).

I read that the Synonym Graph Token Filter could be useful, but I can't find a proper implementation.

dadoonet · January 22, 2018, 1:44pm

I read that the Synonym Graph Token Filter could be useful, but I can't find a proper implementation.

Do you mean this? Synonym Graph Token Filter | Elasticsearch Reference [6.1] | Elastic

Anyway, I "think" it should behave the same way as Solr does, but I'm not an expert on that part. Maybe @jimczi could tell us more?

jimczi · January 22, 2018, 2:38pm

The example in the guide is a bit misleading. It's not a hierarchical structure; only one rule gets applied.
The idea of this example is to show how to rewrite a single term into multiple terms: cat is rewritten to cat and pet, and kitten is expanded to cat, pet, kitten. As you can see, each rule in the example contains all expansions.
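Since rules do not chain, a hierarchy has to be flattened before it is handed to the filter, so that each rule already lists every expansion. The following is a rough sketch of one way to do that, assuming the hierarchy is available as a parent-to-children dictionary. The helper names `flatten_rules` and `to_solr_lines` are hypothetical, and keeping the parent term itself on the right-hand side is a choice made here, not something the thread prescribes. It says nothing about how long index creation takes with very large rule sets.

```python
from typing import Dict, List, Set


def flatten_rules(children: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map every parent term to itself plus all of its descendants."""

    def descendants(term: str, seen: Set[str]) -> List[str]:
        out: List[str] = []
        for child in children.get(term, []):
            if child in seen:  # guard against cycles and repeated terms
                continue
            seen.add(child)
            out.append(child)
            out.extend(descendants(child, seen))
        return out

    return {term: [term] + descendants(term, {term}) for term in children}


def to_solr_lines(flat: Dict[str, List[str]]) -> List[str]:
    """Render each entry as an explicit-mapping rule: 'term => a, b, c'."""
    return [f"{term} => {', '.join(terms)}" for term, terms in flat.items()]


# The hierarchy from the question, as parent -> direct children.
hierarchy = {
    "pet": ["cat", "dog", "bird"],
    "cat": ["kitten", "kitty"],
    "dog": ["chow chow", "malamute"],
    "bird": ["parrot", "hawk"],
}

for line in to_solr_lines(flatten_rules(hierarchy)):
    print(line)
```

With this input, the rule for pet lists all nine lower-level terms, while the rule for dog lists only dog, chow chow, and malamute.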
Now it becomes tricky when you mix indexing and querying. If you set a synonym filter with your example rules at both index and query time, a document that contains pet would be indexed as cat, dog, bird, and a query for pet would in fact search for cat, dog, bird. However, documents that contain dog have already been rewritten into chow chow, malamute, so dog would never match. The synonym_graph filter does not help here; you should revise your rules based on how the synonyms are applied.
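The mismatch described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: it is not how Lucene implements the filter, it treats a multi-word entry such as "chow chow" as a single term, and it ignores token positions. The function name `apply_once` is hypothetical.

```python
from typing import Dict, List

# The rules from the thread, as a toy lookup table (left side => right side).
RULES: Dict[str, List[str]] = {
    "pet": ["cat", "dog", "bird"],
    "cat": ["kitten", "kitty"],
    "dog": ["chow chow", "malamute"],
    "bird": ["parrot", "hawk"],
}


def apply_once(tokens: List[str]) -> List[str]:
    """Rewrite each token with at most one rule; outputs are not rewritten again."""
    out: List[str] = []
    for token in tokens:
        out.extend(RULES.get(token, [token]))
    return out


# Terms stored for a document that contains "dog" (index-time rewrite).
indexed_terms = set(apply_once(["dog"]))
# Terms actually searched when the query is "pet" (query-time rewrite).
query_terms = set(apply_once(["pet"]))

print(sorted(indexed_terms))
print(sorted(query_terms))
print(indexed_terms & query_terms)
```

In this model the two term sets have nothing in common, so a query for pet does not match the document that contained dog.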

system · February 19, 2018, 2:38pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
