Shingles filter with synonym filter?

Hello!

I'm trying to use a shingle filter with the following settings:

"my_shingle_filter" : {
    "type" : "shingle",
    "min_shingle_size" : "2",
    "max_shingle_size" : "5",
    "output_unigrams" : "true"
}

I have a synonym filter with these settings:

"street_synonym" : {
    "type" : "synonym",
    "synonyms_path" : "analysis/synonyms.txt"
}

These are used in a custom analyzer:

"my_shingle_analyzer" : {
    "type" : "custom",
    "stopwords" : "_english_",
    "filter" : [
        "lowercase",
        "my_shingle_filter",
        "street_synonym"
    ],
    "tokenizer" : "standard"
}

When I test the analyzer, the synonyms are applied, but only to single tokens (unigrams), not to the bigrams, trigrams, etc. produced by the shingle filter.

For example, "110 5th Avenue" gets expanded to 110, 110 5th, 110 5th avenue, 5th, Fifth, 5th avenue, avenue.
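
For reference, a request body along these lines, sent to the index's `_analyze` endpoint, should reproduce the test. This is only a sketch: it assumes the analyzer is registered in the index settings under the name `my_shingle_analyzer`, and the index name in the URL is whatever your own index is called.

```json
{
    "analyzer" : "my_shingle_analyzer",
    "text" : "110 5th Avenue"
}
```

The response lists each emitted token with its position and offsets, which makes it easy to see which shingles did or did not receive a synonym.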

How can I set this analyzer up so that I also end up with 110 Fifth and Fifth avenue? Is this even possible? The mapping 5th -> Fifth is in my synonyms file.