Unexpected match when using edgeNgram filter

I encountered some unexpected matches when I set up a field with an analyzer that contains an edgeNgram filter.

This is how I set up the mappings and the settings:

PUT test-index
{
  "mappings": {
    "properties": {
      "name" : { "type" : "text", "analyzer" : "my_custom_analyzer" }
    }
  },
  "settings": {
    "analysis": {
      "analyzer": {
        "my_custom_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "char_filter": [
            "html_strip"
          ],
          "filter": [
            "my_edge_gram"
          ]
        }
      },
      "filter": {
        "my_edge_gram": {
          "type": "edge_ngram",
          "min_gram": 1,
          "max_gram": 20
        }
      }
    }
  }
}

And then I added three documents like this:

PUT test-index/_doc/1
{
    "name": "Mario"
}

PUT test-index/_doc/2
{
    "name": "Maria"
}

PUT test-index/_doc/3
{
    "name": "Merle"
}

Then when I executed this query:

GET test-index/_search
{
    "query": {
        "match" : {
            "name" : "Mario"
        }
    }
}

I got all three documents back, not just the one named "Mario".

I was under the impression that the edge_ngram filter splits a token like "Mario" into the prefixes "M", "Ma", "Mar", "Mari" and "Mario". So I expected a search for "M" to return all three documents, a search for "Ma", "Mar" or "Mari" to return only "Mario" and "Maria", and a search for "Mario" to return only "Mario".
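To make that expectation concrete, here is a minimal Python sketch of the prefix splitting described above. It is not Elasticsearch code: `edge_ngrams` and `matching_docs` are hypothetical helpers that only mimic the `min_gram`/`max_gram` settings from the mapping, and matching is modelled as "any shared term", which is how a `match` query behaves with its default `OR` operator. The last call shows what happens if the query text is split the same way as the indexed text, which is the case when a field has an `analyzer` but no separate `search_analyzer`.

```python
def edge_ngrams(token, min_gram=1, max_gram=20):
    """Return the leading-edge n-grams (prefixes) of a single token."""
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]


# The three documents from the example, keyed by document id.
docs = {1: "Mario", 2: "Maria", 3: "Merle"}

# Toy "index": the set of terms stored for each document.
index = {doc_id: set(edge_ngrams(name)) for doc_id, name in docs.items()}


def matching_docs(query_terms):
    """Ids of documents sharing at least one term with the query (OR semantics)."""
    wanted = set(query_terms)
    return sorted(doc_id for doc_id, terms in index.items() if terms & wanted)


print(edge_ngrams("Mario"))                 # terms stored for document 1
print(matching_docs(["Mario"]))             # query kept as one whole term
print(matching_docs(edge_ngrams("Mario")))  # query split into prefixes too
```

With the query kept as the single term "Mario", only document 1 matches. Once the query is also split into prefixes, the shared term "M" is enough to match all three documents, which is consistent with the result above.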

It seems I have misunderstood something about how edgeNgram works. Can you explain it to me?