Specifying analyzer for the synonym parser

Hi,

I'm designing an analysis chain that combines a shingle filter and a synonym graph filter with other filters (lowercase, stemmer, etc.).

The purpose is to map tokens to a custom dictionary and to support overlapping tokens.
For example:
Dictionary:

red => color_1
red shift => company_1

Query:

Red shift

Desired analysis:

[color_1,company_1]

Now the issue is that when Elasticsearch parses the synonym list, it uses the same analyzer to analyze the terms in the dictionary. Link to source:

The shingle token filter produces n-grams which have position increment 0. Link to source

The SynonymMap throws an exception for tokens whose position increment is not 1. Link to source
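
To make the conflict concrete, here is a small toy model in Python. It is not Lucene or Elasticsearch code: `whitespace_lowercase`, `shingle` and `parse_synonym_term` are hypothetical stand-ins that only imitate the behaviour described above (shingles emitted with position increment 0, and a parser that rejects any increment other than 1).

```python
from typing import Callable, List, Tuple

Token = Tuple[str, int]  # (term, position increment)


def whitespace_lowercase(text: str) -> List[Token]:
    """Toy tokenizer: split on whitespace, lowercase, increment 1 per token."""
    return [(word.lower(), 1) for word in text.split()]


def shingle(tokens: List[Token], size: int = 2) -> List[Token]:
    """Toy shingle filter: keep each unigram and, right after it, emit the
    shingle that starts at the same position with position increment 0."""
    terms = [term for term, _ in tokens]
    out: List[Token] = []
    for i, (term, inc) in enumerate(tokens):
        out.append((term, inc))
        if i + size <= len(terms):
            out.append((" ".join(terms[i:i + size]), 0))
    return out


def parse_synonym_term(text: str, analyze: Callable[[str], List[Token]]) -> List[str]:
    """Toy synonym parser: reject any token whose increment is not 1."""
    words = []
    for term, inc in analyze(text):
        if inc != 1:
            raise ValueError(
                f"term {text!r} analyzed to token {term!r} "
                f"with position increment != 1 (got {inc})"
            )
        words.append(term)
    return words


def chain(text: str) -> List[Token]:
    """The full chain: tokenizer plus shingle filter."""
    return shingle(whitespace_lowercase(text))


print(chain("Red shift"))  # [('red', 1), ('red shift', 0), ('shift', 1)]
```

In this model, `parse_synonym_term("red", chain)` succeeds because a single word produces no shingle, while `parse_synonym_term("red shift", chain)` raises on the `red shift` shingle. That matches the symptom: only the multi-word dictionary entries are rejected.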

I tried to work around this issue in the Lucene code by using a different analyzer to parse the synonyms.
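
The idea can be sketched with another toy model in Python. Again this is not the actual Lucene change: `simple_analyze`, `build_synonym_map`, `shingles` and `analyze_query` are hypothetical names, and looking shingles up in a dict is a simplification of what the synonym graph filter does. The point is only that the dictionary is parsed by an analyzer without the shingle filter, while queries still go through shingling.

```python
from typing import Callable, Dict, List


def simple_analyze(text: str) -> List[str]:
    """Analyzer used only for parsing the dictionary: no shingle filter."""
    return [word.lower() for word in text.split()]


def build_synonym_map(rules: List[str],
                      parser_analyze: Callable[[str], List[str]]) -> Dict[str, str]:
    """Parse 'lhs => rhs' rules, normalizing the left side with parser_analyze."""
    mapping: Dict[str, str] = {}
    for rule in rules:
        lhs, rhs = (side.strip() for side in rule.split("=>"))
        mapping[" ".join(parser_analyze(lhs))] = rhs
    return mapping


def shingles(words: List[str], max_size: int = 2) -> List[str]:
    """All word n-grams up to max_size, grouped by start position."""
    out = []
    for i in range(len(words)):
        for n in range(1, max_size + 1):
            if i + n <= len(words):
                out.append(" ".join(words[i:i + n]))
    return out


def analyze_query(text: str, mapping: Dict[str, str]) -> List[str]:
    """Query-side chain: tokenize, shingle, then keep only mapped terms.
    Dropping unmapped terms is an assumption made to match the example."""
    return [mapping[term] for term in shingles(simple_analyze(text)) if term in mapping]


rules = ["red => color_1", "red shift => company_1"]
synonyms = build_synonym_map(rules, simple_analyze)
print(analyze_query("Red shift", synonyms))  # ['color_1', 'company_1']
```

Because the dictionary side never sees a shingle, the multi-word entry `red shift` is stored as a plain two-word key, and the overlapping query tokens `red` and `red shift` both resolve to their dictionary values.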
