Unexpected behavior when using match_bool_prefix with synonyms

I'm experiencing unexpected behavior when using match_bool_prefix together with a synonym_graph filter on the search analyzer.
I found a similar, previously asked topic, "Prefix query not matching any documents". I used the example from that thread and ran into the same problem.

PUT henkel-test
{
  "settings": {
    "analysis": {
      "filter": {
        "synonyms_filter": {
          "type": "synonym_graph",
          "synonyms": [
            "henke, henk"
          ],
          "updateable": true
        }
      },
      "analyzer": {
        "my_index_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase"]
        },
        "my_search_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "synonyms_filter"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "text": {
        "type": "text",
        "analyzer": "my_index_analyzer",
        "search_analyzer": "my_search_analyzer"
      }
    }
  }
}


PUT henkel-test/_doc/1
{
  "text": "CT17 TRANSPARENT GRUNT 2L HENKEL"
}

POST henkel-test/_search
{
  "profile": true,
  "query": {
    "match_bool_prefix" : {
      "text" : "henk"
    }
  }
}

Searching for "hen" correctly matches "henkel".

Searching for "henk" does not match.

I tested this on Elasticsearch 8.15.2 and Elasticsearch 9.2.1, with the same result on both versions.

Is this expected behavior, or am I missing something? Since match_bool_prefix uses the last term for partial (prefix) matching, I expected "henk" to behave like a prefix and match "henkel".

Hello @gs_elastic

Welcome to the community!!

I observed the same output as described in the post on version 9.2.1.

Looking into this with the help of an LLM, I found the information below, which may be helpful. It suggests that this is not a bug and that the output is as expected:

Document indexed: token is henkel (index analyzer just lowercases).

  1. Search "hen":
    Analyzer yields token hen.
    match_bool_prefix makes a prefix/multi-term query like hen* → matches henkel.

  2. Search "henk":
    Search analyzer runs synonyms: henk → henk, henke (because your synonyms list contains henke, henk).
    The query becomes a SynonymQuery with exact terms (henk and henke), with no prefix expansion on either term, so it does not match the indexed henkel.
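
To see the difference between the two inputs, the _analyze API can show what the search analyzer emits for each. A minimal sketch, assuming the index and analyzer names from the question:

```
GET henkel-test/_analyze
{
  "analyzer": "my_search_analyzer",
  "text": "hen"
}

GET henkel-test/_analyze
{
  "analyzer": "my_search_analyzer",
  "text": "henk"
}
```

If the explanation above holds, the response for henk should list more than one token at the same position, while the response for hen should list a single token.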

match_bool_prefix constructs prefix queries for the final token after analysis. When synonyms replace that token with other tokens that are not prefixes of the indexed token you want, you lose the prefix match. Either make sure the synonym expansion includes the indexed form (henkel) or apply synonyms at index time so the indexed tokens include the variant you search for.
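
As a sketch of the first option, the synonym rule could list the indexed form as well. The rule below is an assumption for illustration only (whether henk and henkel should really be treated as equivalent depends on your data). Because the synonyms are defined inline in the index settings, the index is closed before the analysis settings are changed and reopened afterwards:

```
POST henkel-test/_close

PUT henkel-test/_settings
{
  "analysis": {
    "filter": {
      "synonyms_filter": {
        "type": "synonym_graph",
        "synonyms": [
          "henke, henk, henkel"
        ],
        "updateable": true
      }
    }
  }
}

POST henkel-test/_open
```

With that rule, a search for henk should also produce the term henkel, which can match the document as an exact term rather than as a prefix.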

Because "henke, henk" is bidirectional, when you search "henk":
Analyzer produces: henk
Synonyms produce: henk, henke
BUT no prefix query is applied after synonym expansion, so these become exact term queries, not henk*.
Your indexed token is henkel, which is not equal to henk or henke, so nothing matches.
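
One way to check this is to ask Elasticsearch for the query it actually builds. A sketch using the validate API with explain=true against the index from the question; the exact wording of the explanation output may differ between versions:

```
GET henkel-test/_validate/query?explain=true
{
  "query": {
    "match_bool_prefix": {
      "text": "henk"
    }
  }
}
```

Comparing the explanation for henk with the one for hen should show whether a prefix clause is present in each case.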

Synonym Expansion: When using synonym_graph, the expanded synonyms are treated as separate tokens. These tokens are matched using term queries, not prefix queries.
Prefix Matching: The match_bool_prefix query applies prefix matching only to the last token produced by the analyzer. If the token graph contains multiple tokens at the same position, prefix matching may not work as intended.
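
If prefix behavior on the last term matters more than synonyms for a given search, one possible workaround is to override the analyzer on the query itself. match_bool_prefix accepts an analyzer parameter. The sketch below reuses my_index_analyzer from the question, which has no synonym filter, so the single token henk would be expected to become a prefix query again. Treat it as untested on the versions mentioned above:

```
POST henkel-test/_search
{
  "query": {
    "match_bool_prefix": {
      "text": {
        "query": "henk",
        "analyzer": "my_index_analyzer"
      }
    }
  }
}
```

The trade-off is that synonyms are skipped entirely for that query.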

Thanks!!

Thank you for the detailed explanation. The part about how synonym_graph produces multiple tokens at the same position, preventing match_bool_prefix from applying prefix logic, makes sense.

It might still be useful if the official documentation mentioned this interaction explicitly, as it can be confusing when first encountered. I also noticed that the match query docs include a note about how synonyms interact with fuzziness, and a similar note here could help avoid confusion:
https://www.elastic.co/docs/reference/query-languages/query-dsl/query-dsl-match-query#query-dsl-match-query-fuzziness