Match phrase prefix query stopped working for multiple terms after 7.10.2

Issue: a search using match_phrase_prefix returns no results for queries that consist of multiple terms, e.g. "Additional HTTP".

Elasticsearch 7.10.2: the document is returned when searching for "Additional HTTP".
Elasticsearch 7.11.1, 7.11.2, 7.12.1: no documents are returned when searching for "Additional HTTP"; the document is returned when searching for "Additional" (single term).

Minimal example:

Create index with custom analyzer:

PUT /test
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas" : "0",
    "analysis" : {
      "analyzer" : {
        "custom_analyzer" : {
          "filter" : [
            "lowercase"
          ],
          "type" : "custom",
          "position_increment_gap" : "0",
          "tokenizer" : "whitespace"
        }
      }
    }
  },
  "mappings": {
    "dynamic" : "strict",
    "date_detection" : false,
    "numeric_detection" : false,
    "properties": {
      "sentences" : {
        "dynamic" : "false",
        "properties" : {
          "text" : {
            "type" : "text",
            "analyzer" : "custom_analyzer",
            "index_prefixes" : {
              "min_chars" : 1,
              "max_chars" : 10
            }
          }
        }
      }
    }
  }
}

Add a document:

PUT test/_doc/Search-Test#1
{
  "sentences" : [
    {
      "text" : "4/30/2021"
    },
    {
      "text" : "RFC 6585 - Additional HTTP Status Codes"
    },
    {
      "text" : "Internet Engineering Task Force (IETF)"
    }
  ]
}

Perform a search:

GET /test/_search
{
  "query": {
    "match_phrase_prefix": {
      "sentences.text": {
        "query": "Additional HTTP"
      }
    }
  }
}

UPD: It looks like the "index_prefixes" setting in the index mappings causes the query to stop working. If I remove it, the search returns results, but I can't imagine why.

This looks like an issue then. I opened one on GitHub: "Match phrase query stopped working after 7.10.2" (elastic/elasticsearch, Issue #72885).

Hi @Sleepy_Panda, thanks for opening this. It's definitely a bug, and it's to do with how we handle the position increment gap - specifically, we're not correctly preserving the gap value for the prefix accelerator field if it was set on the root analyzer. You should be able to work around this by setting the position increment gap directly on the text field, like so:

"text" : {
    "type" : "text",
    "analyzer" : "custom_analyzer",
    "position_increment_gap" : 0,
    "index_prefixes" : {
        "min_chars" : 1,
        "max_chars" : 10
    }
}
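
Applied to the index from the original post, the workaround might look like the request below. This is only a sketch: the index name test_workaround is hypothetical, the request has not been run against the affected versions, and it assumes the index is created fresh with the new mapping rather than updated in place.

```
# Sketch only: "test_workaround" is a hypothetical index name.
# The only change from the original mapping is position_increment_gap
# set directly on the text field.
PUT /test_workaround
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0,
    "analysis": {
      "analyzer": {
        "custom_analyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": ["lowercase"],
          "position_increment_gap": 0
        }
      }
    }
  },
  "mappings": {
    "dynamic": "strict",
    "properties": {
      "sentences": {
        "dynamic": "false",
        "properties": {
          "text": {
            "type": "text",
            "analyzer": "custom_analyzer",
            "position_increment_gap": 0,
            "index_prefixes": {
              "min_chars": 1,
              "max_chars": 10
            }
          }
        }
      }
    }
  }
}
```

After reindexing the sample document into this index, the same match_phrase_prefix query for "Additional HTTP" can be used to check whether the document is returned again.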

Incidentally, have you found that changing the index_prefixes values from their default settings of {2, 5} has made a difference? Particularly on the lower end, prefix queries with a single character get rewritten to use simple wildcards and should still perform well. For example, "text:a*" gets rewritten to "text._secret_prefix_field:(a || a?)", which has a very low expansion cost and uses a lot less disk space. And storing prefixes beyond 5 characters doesn't gain you a great deal for most English text either, at least in my experiments.
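
For comparison, a mapping that keeps the default prefix lengths could be sketched as follows. The index name test_default_prefixes and the single top-level text field are hypothetical, and the values 2 and 5 are written out only to make the defaults mentioned above visible; no performance or disk-usage figures are implied.

```
# Sketch only: "test_default_prefixes" is a hypothetical index name.
# min_chars 2 and max_chars 5 are the defaults, spelled out explicitly.
PUT /test_default_prefixes
{
  "mappings": {
    "properties": {
      "text": {
        "type": "text",
        "index_prefixes": {
          "min_chars": 2,
          "max_chars": 5
        }
      }
    }
  }
}

# A single-character prefix query against that field.
GET /test_default_prefixes/_search
{
  "query": {
    "prefix": {
      "text": {
        "value": "a"
      }
    }
  }
}
```

Comparing index size and query latency between this mapping and the {1, 10} settings on real data would show whether the wider range is worth its cost.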

  • Alan W

This has been fixed in 7.13.1 and above. See "Search analyzer should default to configured index analyzer over default" by romseygeek (elastic/elasticsearch, Pull Request #73359).