ES 5.4 synonyms and shingles don't seem to work together

I've applied the following synonym filter to an analyzer that is used only at search time, not at index time:

  "synonym":{  
    "type":"synonym",
    "expand":true,
    "synonyms": [
      "uberground, underground"
    ]
  },

If I search for either uberground or underground, I get a match on a document containing the phrase underground water, as expected.

If I also create a bi-gram shingle filter and apply it at both index and search time, a search for uberground water matches the same document, as expected. However, if I search for underground water, I no longer get a match. That changes if I rewrite the synonym mapping as follows:

  "synonym":{  
    "type":"synonym",
    "expand":true,
    "synonyms": [
      "uberground, underground => uberground, underground"
    ]
  },

With this mapping, the shingle-based search works regardless of whether I search for uberground water or underground water. As far as I know, all I've done here is explicitly define the synonyms as an expansion, rather than rely on the expand parameter to do this for me. I've also tried leaving the parameter out (it should default to true) and putting the true in quotes. For some reason the expand parameter works as expected when searching individual words, but seems to be disabled when searching shingles.

I've also noticed that adding another mapping to the synonym definitions causes additional problems:

  "synonym":{  
    "type":"synonym",
    "expand":true,
    "synonyms": [
      "uberground, underground => uberground, underground",
      "sherbert, water => sherbert, water"
    ]
  },

Now when I search against the bi-gram shingles with the synonym filter applied to the query, I don't get any matches, no matter which combination of words I put in the query. I guess this has something to do with both members of the shingle being synonyms. I expected to get matches on any bigram whose first word comes from the first synonym set and whose second word comes from the second synonym set, since I know underground water appears in the document.

Am I doing something wrong, are my expectations wrong, or is this a bug?
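For reference, the setup described above might look roughly like the index definition below. This is only a sketch: the post does not show the shingle filter, the analyzers, or the mapping, so every name starting with example_ is a placeholder, and the tokenizer, the lowercase filter, the shingle sizes, and the order of the filters (synonyms before shingles on the search side) are all assumptions.

  {
    "settings": {
      "analysis": {
        "filter": {
          "example_shingle": {
            "type": "shingle",
            "min_shingle_size": 2,
            "max_shingle_size": 2
          },
          "synonym": {
            "type": "synonym",
            "expand": true,
            "synonyms": [
              "uberground, underground"
            ]
          }
        },
        "analyzer": {
          "example_index_analyzer": {
            "type": "custom",
            "tokenizer": "standard",
            "filter": ["lowercase", "example_shingle"]
          },
          "example_search_analyzer": {
            "type": "custom",
            "tokenizer": "standard",
            "filter": ["lowercase", "synonym", "example_shingle"]
          }
        }
      }
    },
    "mappings": {
      "example_type": {
        "properties": {
          "example_field": {
            "type": "text",
            "analyzer": "example_index_analyzer",
            "search_analyzer": "example_search_analyzer"
          }
        }
      }
    }
  }

Only the search analyzer includes the synonym filter, which matches the search-time-only setup described at the top of the post.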

This is a known problem, and it is still not fully resolved. A number of Lucene filters can't consume graphs as their inputs.
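One way to see what the search-side filter chain actually emits is the _analyze API. A minimal sketch, assuming the search analyzer is called example_search_analyzer (a placeholder name, substitute your own): send a body like the one below to the _analyze endpoint of the index, once for underground water and once for uberground water.

  {
    "analyzer": "example_search_analyzer",
    "text": "underground water"
  }

Each token in the response carries a position value, so comparing the two responses should show which shingles are produced after synonym expansion and where they differ.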

Work is currently under way on a fixed shingle filter, and there is also an idea to index shingles in a dedicated sub-field.