Specifying analyzer for the synonym parser

aclowkey · May 16, 2018, 2:48pm

Hi,

I'm designing an analysis chain that combines a shingle filter and a synonym graph filter with other filters such as lowercase and a stemmer.

The purpose is to map tokens to a custom dictionary and to support overlapping tokens.
For example:
Dictionary:

red => color_1
red shift => company_1

Query:

Red shift

Desired analysis:

[color_1,company_1]

Now the issue is that when Elasticsearch parses the synonym list, it uses this same analyzer to analyze the terms in the dictionary. Link to source:

The shingle token filter produces n-grams which have position increment 0. Link to source

The SynonymMap throws an exception for any token whose position increment is != 1. Link to source
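
To make the conflict concrete, here is a small pure-Python model of the three points above. It does not call Lucene or Elasticsearch: `Token`, `tokenize`, `shingle` and `parse_synonym_term` are hypothetical stand-ins that only mimic the position-increment behaviour described in this post, and the error text is my own wording.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Token:
    term: str
    pos_inc: int  # position increment relative to the previous token


def tokenize(text: str) -> List[Token]:
    """Lowercase whitespace tokenizer: every token advances the position by 1."""
    return [Token(t, 1) for t in text.lower().split()]


def shingle(tokens: List[Token], size: int = 2) -> List[Token]:
    """Emit each unigram, then its shingle stacked at the same position (pos_inc 0)."""
    out: List[Token] = []
    for i, tok in enumerate(tokens):
        out.append(tok)
        if i + size <= len(tokens):
            joined = " ".join(t.term for t in tokens[i:i + size])
            out.append(Token(joined, 0))
    return out


def parse_synonym_term(text: str, analyze: Callable[[str], List[Token]]) -> List[str]:
    """Model of the parser-side check: reject any analyzed token with pos_inc != 1."""
    terms: List[str] = []
    for tok in analyze(text):
        if tok.pos_inc != 1:
            raise ValueError(
                f"term '{text}' analyzed to token '{tok.term}' "
                f"with position increment {tok.pos_inc}, expected 1"
            )
        terms.append(tok.term)
    return terms
```

With `analyze = lambda text: shingle(tokenize(text))`, the rule input `red` parses fine, but `red shift` raises because the shingle `red shift` is stacked on `red` with increment 0. Passing plain `tokenize` instead parses both.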

I tried to work around this issue in the Lucene code, by using a different analyzer to parse the synonyms.
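
As a rough picture of that idea (not the actual Lucene change), the sketch below parses the dictionary with a plain analyzer and applies shingles only to the query. Every name here (`plain_analyzer`, `shingle_analyzer`, `build_synonyms`, `analyze_query`) is hypothetical, and dropping unmatched tokens is an assumption made so the output matches the desired analysis above.

```python
from typing import Callable, Dict, List


def plain_analyzer(text: str) -> List[str]:
    """Lowercase + whitespace split; no shingles, so each token has its own position."""
    return text.lower().split()


def shingle_analyzer(text: str, max_size: int = 2) -> List[str]:
    """Query-side chain: unigrams plus overlapping shingles up to max_size words."""
    words = plain_analyzer(text)
    out: List[str] = []
    for i in range(len(words)):
        for n in range(1, max_size + 1):
            if i + n <= len(words):
                out.append(" ".join(words[i:i + n]))
    return out


def build_synonyms(rules: List[str], analyze: Callable[[str], List[str]]) -> Dict[str, str]:
    """Parse 'input => output' rules, analyzing the input side with the given analyzer."""
    table: Dict[str, str] = {}
    for rule in rules:
        left, right = (side.strip() for side in rule.split("=>"))
        table[" ".join(analyze(left))] = right
    return table


def analyze_query(query: str, table: Dict[str, str]) -> List[str]:
    """Return the dictionary ids of shingled query tokens that match a rule."""
    return [table[tok] for tok in shingle_analyzer(query) if tok in table]
```

Building the table from the two rules above with `plain_analyzer` and then calling `analyze_query("Red shift", table)` yields both `color_1` and `company_1`, since the query produces the overlapping tokens `red` and `red shift`.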
