Elasticsearch Match Phrase Query Not Returning Expected Results

I have a document whose field "title1" has the value "A380 Familiarization".

When I run the query below, the search results are empty: the document is not returned.

GET index/_search
{
    "from": 0,
    "size": 10,
    "_source":
    [
        "title1"
    ],
    "query":
    {
        "bool":
        {
            "must":
            {
                "bool":
                {
                    "should":
                    [
                        {
                            "multi_match":
                            {
                                "query": "A380 Familiarization",
                                "analyzer": "search_analyzer1",
                                "type": "phrase",
                                "fields":
                                [
                                    "title1"
                                ]
                            }
                        }
                    ]
                }
            }
        }
    },
    "sort":
    [
        {
            "_score": "desc"
        }
    ]
}

The field title1 is indexed using a different analyzer:

"title1": {
  "type": "text",
  "fields": {
    "raw": {
      "type": "keyword"
    }
  },
  "analyzer": "search_analyzer2"
}

Here is the definition of search_analyzer2:

"search_analyzer2": {
  "type": "custom",
  "tokenizer": "standard",
  "filter": [
    "icu_folding",
    "word_delimiter_filter",
    "edge_ngram_filter"
  ],
  "char_filter": [
    "html_strip"
  ]
}

And here is the definition of search_analyzer1:

"search_analyzer1": {
  "type": "custom",
  "tokenizer": "standard",
  "filter": [
    "icu_folding"
  ],
  "char_filter": [
    "html_strip"
  ]
}

When I run the _analyze command with both analyzers, there are overlapping tokens. Both produce the token a380, yet the document is still not returned.

GET index/_analyze
{
  "analyzer": "search_analyzer1",
  "text": "A380"
}

GET index/_analyze
{
  "analyzer": "search_analyzer2",
  "text": "A380"
}

Please advise: why is the document not returned in the search results?

What is the output of both _analyze calls?

The analysis you showed is not for your actual search string. How is that string analyzed by each analyzer? And how is the indexed string you want to match analyzed?

Here is the analyze result for analyzer2:

GET index/_analyze
{
  "analyzer": "search_analyzer2",
  "text": "A380 Familiarization"
}

{
  "tokens": [
    {
      "token": "a",
      "start_offset": 0,
      "end_offset": 4,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "a3",
      "start_offset": 0,
      "end_offset": 4,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "a38",
      "start_offset": 0,
      "end_offset": 4,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "a380",
      "start_offset": 0,
      "end_offset": 4,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "a",
      "start_offset": 0,
      "end_offset": 1,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "3",
      "start_offset": 1,
      "end_offset": 4,
      "type": "<ALPHANUM>",
      "position": 1
    },
    {
      "token": "38",
      "start_offset": 1,
      "end_offset": 4,
      "type": "<ALPHANUM>",
      "position": 1
    },
    {
      "token": "380",
      "start_offset": 1,
      "end_offset": 4,
      "type": "<ALPHANUM>",
      "position": 1
    },
    {
      "token": "f",
      "start_offset": 5,
      "end_offset": 20,
      "type": "<ALPHANUM>",
      "position": 2
    },
    {
      "token": "fa",
      "start_offset": 5,
      "end_offset": 20,
      "type": "<ALPHANUM>",
      "position": 2
    },
    {
      "token": "fam",
      "start_offset": 5,
      "end_offset": 20,
      "type": "<ALPHANUM>",
      "position": 2
    },
    {
      "token": "fami",
      "start_offset": 5,
      "end_offset": 20,
      "type": "<ALPHANUM>",
      "position": 2
    },
    {
      "token": "famil",
      "start_offset": 5,
      "end_offset": 20,
      "type": "<ALPHANUM>",
      "position": 2
    },
    {
      "token": "famili",
      "start_offset": 5,
      "end_offset": 20,
      "type": "<ALPHANUM>",
      "position": 2
    },
    {
      "token": "familia",
      "start_offset": 5,
      "end_offset": 20,
      "type": "<ALPHANUM>",
      "position": 2
    },
    {
      "token": "familiar",
      "start_offset": 5,
      "end_offset": 20,
      "type": "<ALPHANUM>",
      "position": 2
    },
    {
      "token": "familiari",
      "start_offset": 5,
      "end_offset": 20,
      "type": "<ALPHANUM>",
      "position": 2
    },
    {
      "token": "familiariz",
      "start_offset": 5,
      "end_offset": 20,
      "type": "<ALPHANUM>",
      "position": 2
    },
    {
      "token": "familiariza",
      "start_offset": 5,
      "end_offset": 20,
      "type": "<ALPHANUM>",
      "position": 2
    },
    {
      "token": "familiarizat",
      "start_offset": 5,
      "end_offset": 20,
      "type": "<ALPHANUM>",
      "position": 2
    },
    {
      "token": "familiarizati",
      "start_offset": 5,
      "end_offset": 20,
      "type": "<ALPHANUM>",
      "position": 2
    },
    {
      "token": "familiarizatio",
      "start_offset": 5,
      "end_offset": 20,
      "type": "<ALPHANUM>",
      "position": 2
    },
    {
      "token": "familiarization",
      "start_offset": 5,
      "end_offset": 20,
      "type": "<ALPHANUM>",
      "position": 2
    }
  ]
}

And here is the analyze result for analyzer1:

GET index/_analyze
{
  "analyzer": "search_analyzer1",
  "text": "A380 Familiarization"
}

{
  "tokens": [
    {
      "token": "a380",
      "start_offset": 0,
      "end_offset": 4,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "familiarization",
      "start_offset": 5,
      "end_offset": 20,
      "type": "<ALPHANUM>",
      "position": 1
    }
  ]
}
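
Comparing the two outputs, token positions look like the likely cause. With search_analyzer2 the index holds a380 at position 0 and familiarization at position 2 (the split of "A380" into "a" and "380" takes up position 1), while search_analyzer1 puts the query terms at positions 0 and 1. A phrase query expects the terms at the same relative positions. The Python sketch below is only a simplified model of that check, not Elasticsearch's implementation; the function name phrase_matches and the shortened token lists are assumptions made for illustration.

```python
# Tokens copied from the _analyze outputs above as (term, position) pairs.
# Only a few index-side tokens are listed; the remaining edge n-grams share
# the same positions and do not change the result.
indexed_tokens = [("a380", 0), ("a", 0), ("380", 1), ("familiarization", 2)]
query_tokens = [("a380", 0), ("familiarization", 1)]


def phrase_matches(indexed, query):
    """Simplified slop-0 phrase check (hypothetical helper): every query term
    must occur in the field at the same relative position as in the query."""
    positions = {}
    for term, pos in indexed:
        positions.setdefault(term, set()).add(pos)

    first_term, first_pos = query[0]
    for start in positions.get(first_term, ()):
        offset = start - first_pos
        if all(q_pos + offset in positions.get(term, ()) for term, q_pos in query):
            return True
    return False


# Positions as indexed by search_analyzer2: familiarization sits at 2, not 1.
print(phrase_matches(indexed_tokens, query_tokens))  # False

# If "A380" were not split, familiarization would sit at position 1.
unsplit_tokens = [("a380", 0), ("familiarization", 1)]
print(phrase_matches(unsplit_tokens, query_tokens))  # True
```

Under this model the phrase fails even though both terms exist in the index, which is consistent with the query returning results once "type": "phrase" is removed.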

What if you remove this line?

"type": "phrase",

If I remove "type": "phrase", I get the search results.
However, I need it to work with type: phrase.
Phrase means exact match, and I am searching with the exact terms, so why are no results returned?
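
One possible workaround, assuming the mismatch comes from token positions (per the _analyze outputs in this thread, a380 and familiarization are at positions 0 and 2 in the index but 0 and 1 in the query), is to keep "type": "phrase" and add a slop of 1. The Python sketch below only builds the request body; the helper name build_phrase_query is hypothetical, the slop value is an assumption derived from those positions, and the query has not been run against this index.

```python
import json


def build_phrase_query(text, field, analyzer, slop):
    """Hypothetical helper: build a multi_match phrase query body with slop."""
    return {
        "_source": [field],
        "query": {
            "multi_match": {
                "query": text,
                "analyzer": analyzer,
                "type": "phrase",
                # Assumption: a slop of 1 tolerates the one extra position
                # that the split of "A380" introduces at index time.
                "slop": slop,
                "fields": [field],
            }
        },
    }


body = build_phrase_query("A380 Familiarization", "title1", "search_analyzer1", 1)
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted after GET index/_search. A slop greater than 0 also lets other near matches through, so whether this is acceptable depends on how strict the phrase match needs to be.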
