Exact matches with match query on analyzed field not showing on top result


(Shashank Reddy) #1

I have a data set like the one below:

"hits": [
         {
            "_index": "demotest",
            "_type": "demotest",
            "_id": "2",
            "_score": 1,
            "_source": {
               "name": "Duvvuri ram gopal reddy"
            }
         },
         {
            "_index": "demotest",
            "_type": "demotest",
            "_id": "1",
            "_score": 1,
            "_source": {
               "name": "ram gopal reddy"
            }
         },
         {
            "_index": "demotest",
            "_type": "demotest",
            "_id": "3",
            "_score": 1,
            "_source": {
               "name": "reddy ram gopal"
            }
         }
      ]

When I run a match query with the value ram gopal reddy, the exactly matching record does not show up on top. Query:

GET demotest/_search
{
    "query": {
        "match": {
           "name": "ram gopal reddy"
        }
    }
}

Result after execution of above query:

"hits": [
         {
            "_index": "demotest",
            "_type": "demotest",
            "_id": "2",
            "_score": 0.8630463,
            "_source": {
               "name": "Duvvuri ram gopal reddy"
            }
         },
         {
            "_index": "demotest",
            "_type": "demotest",
            "_id": "1",
            "_score": 0.7594807,
            "_source": {
               "name": "ram gopal reddy"
            }
         },
         {
            "_index": "demotest",
            "_type": "demotest",
            "_id": "3",
            "_score": 0.7594807,
            "_source": {
               "name": "reddy ram gopal"
            }
         }
      ]

How can I get the exactly matching record on top of the search results? Thanks.


(David Pilato) #2

I'd use a bool query with multiple should clauses. One would be set with a phrase query and the other one as a match query.

I wrote a full (but complex) example here: https://gist.github.com/dadoonet/5179ee72ecbf08f12f53d4bda1b76bab

Should give something like:

GET oss/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match_phrase": {
            "name": {
              "query" : "david",
              "boost": 8.0
            }
          }
        },
        {
          "match": {
            "name": {
              "query": "david"
            }
          }
        }
      ]
    }
  }
}

(Shashank Reddy) #3

I did not get the desired result with the above answer.
I posted my complete test here,
and I want a case-insensitive search, not a phrase search.


(David Pilato) #4

I don't understand. You said that you want your search to be case insensitive?

But this is the case here:

{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 4,
    "max_score": 1.7260926,
    "hits": [
      {
        "_index": "demotest",
        "_type": "demotest",
        "_id": "9",
        "_score": 1.7260926,
        "_source": {
          "name": "Dulugunti Ram Gopal Reddy "
        }
      },
      {
        "_index": "demotest",
        "_type": "demotest",
        "_id": "2",
        "_score": 1.7260926,
        "_source": {
          "name": "Duvvuri ram gopal reddy"
        }
      },
      {
        "_index": "demotest",
        "_type": "demotest",
        "_id": "1",
        "_score": 1.5189614,
        "_source": {
          "name": "ram gopal reddy"
        }
      },
      {
        "_index": "demotest",
        "_type": "demotest",
        "_id": "3",
        "_score": 0.7594807,
        "_source": {
          "name": "reddy ram gopal"
        }
      }
    ]
  }
}

(Shashank Reddy) #5

I want the ram gopal reddy record to come on top even if I search for Ram gopal reddy (capital R).


(David Pilato) #6

Then add a term query as a new should clause with some boost. That should work.


(Shashank Reddy) #7

I have tried that too, but didn't get the desired output.


(David Pilato) #9

Maybe this then:

DELETE demotest
PUT demotest
{
  "mappings": {
    "demotest": {
      "properties": {
        "name": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "text",
              "analyzer": "keyword"
            }
          }
        }
      }
    }
  }
}
POST /demotest/demotest/2
{
  "name": "Duvvuri ram gopal reddy"
}
POST /demotest/demotest/1
{
  "name": "ram gopal reddy"
}
POST /demotest/demotest/3
{
  "name": "reddy ram gopal"
}
POST /demotest/demotest/9
{
  "name": "Dulugunti Ram Gopal Reddy "
}

#search

GET demotest/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "term": {
            "name.keyword": {
              "value": "ram gopal reddy"
            }
          }
        },
        {
          "match_phrase": {
            "name": {
              "query" : "ram gopal reddy"
            }
          }
        },
        {
          "match": {
            "name": {
              "query": "ram gopal reddy"
            }
          }
        }
      ]
    }
  }
}
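
To check why a given query string does or does not hit the name.keyword sub-field, the _analyze API can show the tokens that field produces. A minimal sketch, assuming the demotest index and mapping above; the sample text is only an illustration:

```json
GET demotest/_analyze
{
  "field": "name.keyword",
  "text": "Ram gopal reddy"
}
```

The keyword analyzer emits the whole input as a single token and does not lowercase it, and a term query does not analyze its value, so the term clause above only matches when the case is identical to what was indexed.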

(Shashank Reddy) #10

It works when the query is ram gopal reddy.
But if I change the query to Ram gopal reddy,
I get the same old results, without the exact match on top.


(David Pilato) #11

Sure. It's a combination of things. You basically need to find the right analyzers for your case, apply them to different sub-fields, and then combine the searches you need as should clauses within a bool query, as I showed you.

Now I believe you have all the information, and you need to work out how to combine it to fit your exact use case.
You can also play a bit with boost factors if needed.
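
One possible way to combine these pieces is sketched below; it has not been run against this exact data. The analyzer name lowercase_keyword and the sub-field name exact are made up for illustration, the boost values are arbitrary starting points, and the index has to be deleted and re-created (and the documents re-indexed) for the new mapping to apply. The idea is a sub-field that keeps the whole name as one lowercased token, queried with match so that the search text goes through the same analyzer:

```json
PUT demotest
{
  "settings": {
    "analysis": {
      "analyzer": {
        "lowercase_keyword": {
          "type": "custom",
          "tokenizer": "keyword",
          "filter": ["trim", "lowercase"]
        }
      }
    }
  },
  "mappings": {
    "demotest": {
      "properties": {
        "name": {
          "type": "text",
          "fields": {
            "exact": {
              "type": "text",
              "analyzer": "lowercase_keyword"
            }
          }
        }
      }
    }
  }
}

GET demotest/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "name.exact": {
              "query": "Ram gopal reddy",
              "boost": 10
            }
          }
        },
        {
          "match_phrase": {
            "name": {
              "query": "Ram gopal reddy",
              "boost": 2
            }
          }
        },
        {
          "match": {
            "name": {
              "query": "Ram gopal reddy"
            }
          }
        }
      ]
    }
  }
}
```

Unlike term, the match clause on name.exact analyzes the query text, so Ram gopal reddy and ram gopal reddy both become the same single token and only a document whose full name is exactly that (ignoring case and surrounding spaces) earns the extra boost.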


(Shashank Reddy) #12

I tried adjusting the boost values for match_phrase, match and wildcard, keeping all three as should clauses in a bool query, but I am still not getting the desired output. I have been trying for the past 6 days.


(Emmanuel Rouby) #13

Concerning case insensitivity, for my use case:

I added a custom lowercase analyzer on the field "name" and then, when building the request, I lowercase the searched value before execution.

For your use case, you could also split the searched value to get a request like this
(ugly, but it works):

GET testindex/type1/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "term": {
            "name": {
              "value": "ram gopal reddy"
            }
          }
        },
        {
          "bool": {
            "must": [
              {
                "wildcard": {
                  "name": "*ram*"
                }
              },
              {
                "wildcard": {
                  "name": "*gopal*"
                }
              },
              {
                "wildcard": {
                  "name": "*reddy*"
                }
              }
            ]
          }
        }
      ]
    }
  }
}

You can test that with a not_analyzed field; if it works for you, replace not_analyzed with a custom lowercase analyzer.

PUT testindex
{
  "mappings": {
    "type1": {
      "properties": {
        "name": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}
