How to remove irrelevant results when a perfect match is found?

Hi!

I have an index of lyrics with the following structure:

{
  "mappings": {
    "_doc": {
      "properties": {
        "title": {
          "type": "text",
          "analyzer": "kuromoji"
        },
        "artist": {
          "properties": {
            "id": {
              "type": "long"
            },
            "name": {
              "type": "text",
              "analyzer": "kuromoji"
            }
          }
        }
      }
    }
  }
}

*I use kuromoji since the lyrics are mostly Japanese.

I want to build a search feature where users can find songs by artist.name, by title, or by a combination of the two. For example, searching for "Bruno Mars" should return all of Bruno Mars' songs, while "Bruno Mars Just The Way You Are" should return only that song (or very similar ones).

I used the following query:

{
  "query": {
    "multi_match": {
      "type": "most_fields",
      "query": "Bruno Mars Just The Way You Are",
      "fields": [
          "title",
          "artist.name"
      ]
    }
  }
}

It works: Bruno Mars' "Just The Way You Are" is the top result.
The only issue is that other Bruno Mars songs are also returned:

"hits": [
  {
      "_index": "lyric",
      "_type": "_doc",
      "_id": "93818",
      "_score": 37.77198,
      "_source": {
          "artist": {
              "name": "Bruno Mars"
          },
          "title": "Just The Way You Are"
      }
  },
  {
      "_index": "lyric",
      "_type": "_doc",
      "_id": "93829",
      "_score": 25.52073,
      "_source": {
          "artist": {
              "name": "Bruno Mars, Lupe Fiasco"
          },
          "title": "Just The Way You Are (Ft.Lupe Fiasco) (Remix)"
      }
  },
  {
      "_index": "lyric",
      "_type": "_doc",
      "_id": "93825",
      "_score": 23.43954,
      "_source": {
          "artist": {
              "name": "Bruno Mars"
          },
          "title": "The Lazy Song"
      }
  },
  {
      "_index": "lyric",
      "_type": "_doc",
      "_id": "93823",
      "_score": 22.316628,
      "_source": {
          "artist": {
              "name": "Bruno Mars"
          },
          "title": "Talking To The Moon"
      }
  },
  {
      "_index": "lyric",
      "_type": "_doc",
      "_id": "90408",
      "_score": 22.242022,
      "_source": {
          "artist": {
              "name": "Bruno Mars"
          },
          "title": "Marry You"
      }
  }
]

hits[0] is a perfect match and hits[1] is similar, so those two are the results I want. The others are also Bruno Mars songs, but I don't want them included in the result.

Is this possible to achieve using Elasticsearch?

I hope my question is clear enough. If I'm missing something, please let me know!

Thank you!

I think you could set minimum_should_match, as explained here: https://www.elastic.co/guide/en/elasticsearch/reference/6.6/query-dsl-minimum-should-match.html
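
As a rough sketch of how that might look for the query in the question: with "most_fields", minimum_should_match is applied to each field separately, and neither title nor artist.name alone contains both the artist and the title terms. Switching the type to "cross_fields" treats the two fields as one combined field, which should work here because both use the same kuromoji analyzer. The "cross_fields" type and the "100%" value are my assumptions and have not been tested against this index.

```json
{
  "query": {
    "multi_match": {
      "type": "cross_fields",
      "query": "Bruno Mars Just The Way You Are",
      "fields": [
        "title",
        "artist.name"
      ],
      "minimum_should_match": "100%"
    }
  }
}
```

With "100%", every query term has to appear in at least one of the two fields, so "The Lazy Song" should drop out while a plain "Bruno Mars" search would still return all of his songs. A lower value such as "75%" tolerates an extra or misspelled word: the computed number of required terms is rounded down, so 7 terms would need 5 matches. The exact term count depends on how kuromoji tokenizes the input.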

Hi @dadoonet thank you for the suggestion, I'll try using minimum_should_match and report back later!
