Match phrase prefix query not working as expected

According to the documentation:

Match phrase prefix query
Returns documents that contain the words of a provided text, in the same order as provided. The last term of the provided text is treated as a prefix, matching any words that begin with that term.
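To make the quoted definition concrete, here is a minimal Python sketch of the matching rule as described: every term but the last must match exactly and in order, and the last term matches as a prefix. It is a simplified model and not how Elasticsearch implements the query. The `tokenize` and `matches_phrase_prefix` helpers are hypothetical names, and the regex tokenizer is only a rough stand-in for the standard analyzer.

```python
import re

def tokenize(text):
    # Rough stand-in for the standard analyzer (assumption):
    # lowercase the text and split on non-word characters.
    return re.findall(r"\w+", text.lower())

def matches_phrase_prefix(doc_text, query_text):
    # True if the document contains the query terms in order,
    # with the last query term treated as a prefix.
    doc = tokenize(doc_text)
    query = tokenize(query_text)
    if not query:
        return False
    *exact, prefix = query
    n = len(query)
    # Slide a window of n consecutive tokens over the document.
    for start in range(len(doc) - n + 1):
        window = doc[start:start + n]
        if window[:-1] == exact and window[-1].startswith(prefix):
            return True
    return False

print(matches_phrase_prefix("Farewell, gentle cousin.", "farewell ge"))
print(matches_phrase_prefix("Farewell, gentle cousin.", "farewell g"))
```

Under this idealized rule both calls report a match, which is the behavior the definition leads one to expect for a single-character prefix as well.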

The following queries return no hits when the last term is a single character but do match when it has two or more characters, which contradicts the example in the documentation for the match phrase prefix query.

GET shakespeare/_mapping/field/text_entry

---------- OUTPUT----------
{
  "shakespeare" : {
    "mappings" : {
      "text_entry" : {
        "full_name" : "text_entry",
        "mapping" : {
          "text_entry" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          }
        }
      }
    }
  }
}

POST shakespeare/_search
{
  "_source": "text_entry", 
  "size": 100,
  "query": {
    "match_phrase_prefix": {
      "text_entry": "farewell g"
    }
  }
}

---------- OUTPUT----------
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}

POST shakespeare/_search
{
  "_source": "text_entry",
  "size": 100, 
  "query": {
    "match_phrase_prefix": {
      "text_entry": "farewell ge"
    }
  }
}

---------- OUTPUT----------
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 641.55804,
    "hits" : [
      {
        "_index" : "shakespeare",
        "_type" : "doc",
        "_id" : "44803",
        "_score" : 641.55804,
        "_source" : {
          "text_entry" : "Farewell, gentle cousin."
        }
      },
      {
        "_index" : "shakespeare",
        "_type" : "doc",
        "_id" : "65901",
        "_score" : 559.4885,
        "_source" : {
          "text_entry" : "Farewell, gentle mistress: farewell, Nan."
        }
      },
      {
        "_index" : "shakespeare",
        "_type" : "doc",
        "_id" : "83805",
        "_score" : 525.8544,
        "_source" : {
          "text_entry" : "Farewell, good cousin; farewell, gentle friends."
        }
      }
    ]
  }
}

Note that the field uses the default standard analyzer, which does not discard the single-character token "g":

POST shakespeare/_analyze
{
  "field": "text_entry",
  "text": "farewell g"
}

---------- OUTPUT----------
{
  "tokens" : [
    {
      "token" : "farewell",
      "start_offset" : 0,
      "end_offset" : 8,
      "type" : "<ALPHANUM>",
      "position" : 0
    },
    {
      "token" : "g",
      "start_offset" : 9,
      "end_offset" : 10,
      "type" : "<ALPHANUM>",
      "position" : 1
    }
  ]
}
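
One possible explanation, which the output above does not confirm on its own: the match phrase prefix query accepts a max_expansions parameter (default 50) that caps how many index terms the final prefix expands to. A single letter such as "g" probably matches far more than 50 terms in this index, so "gentle" and "good" may fall outside the expanded set, while "ge" matches fewer terms. The Python sketch below simulates that effect. The term dictionary and the `expand_prefix` helper are hypothetical, and it assumes terms are enumerated in sorted order.

```python
def expand_prefix(sorted_terms, prefix, max_expansions=50):
    # Collect terms starting with the prefix, in dictionary order,
    # and keep only the first max_expansions of them.
    matches = [t for t in sorted_terms if t.startswith(prefix)]
    return matches[:max_expansions]

# Hypothetical term dictionary: 60 filler terms "ga00".."ga59"
# that sort before "gentle" and "good".
terms = sorted([f"ga{i:02d}" for i in range(60)] + ["gentle", "good"])

# With the cap of 50, the prefix "g" never reaches "gentle".
print("gentle" in expand_prefix(terms, "g"))
# The longer prefix "ge" matches few enough terms to include it.
print("gentle" in expand_prefix(terms, "ge"))
# Raising the cap brings "gentle" back for the prefix "g".
print("gentle" in expand_prefix(terms, "g", max_expansions=1000))
```

If this is the cause, writing the query in its object form, with "query": "farewell g" and a larger "max_expansions" value under "text_entry", may return the expected hits, at the cost of a more expensive query.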