Search as you type with 100k rows: help with scoring

Hi All,

I have an index that contains drug names (100k of them). I'm using a React front end to search as you type, which is working quite well, except that the scoring coming back is not what I expect despite endless fiddling.

The following are returned with the same score when searching for "omep":

  • Esomeprazole 20mg gastro-resistant granules sachets (Alliance Healthcare (Distribution) Ltd) 28 sachet
  • Omeprazole 20mg gastro-resistant tablets 7 tablet

(There are actually lots of results with the same score; I've just used these two to demonstrate.)

I'm hoping that the second line can be scored higher than the first. I'm designing the search so the user can type anything (like Google), so if I go with any type of prefix matching and they type "20mg" first, for example, it wouldn't work.

Could someone offer some guidance? I've been tweaking for weeks and I'm no closer.

My index setup:

{
  "product" : {
    "settings" : {
      "index" : {
        "max_ngram_diff" : "20",
        "number_of_shards" : "5",
        "provided_name" : "product",
        "creation_date" : "1658446282055",
        "analysis" : {
          "analyzer" : {
            "autocomplete_filter" : {
              "filter" : [ "lowercase" ],
              "type" : "custom",
              "tokenizer" : "edge_ngram"
            }
          },
          "tokenizer" : {
            "edge_ngram" : {
              "type" : "ngram",
              "min_gram" : "1",
              "max_gram" : "20"
            }
          }
        },
        "number_of_replicas" : "1",
        "uuid" : "RceygODWR7CARoQXvSYPqg",
        "version" : {
          "created" : "135238227"
        }
      }
    }
  }
}

{
  "product" : {
    "mappings" : {
      "properties" : {
        "appId" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "isNMP" : {
          "type" : "long"
        },
        "productName" : {
          "type" : "text",
          "analyzer" : "autocomplete_filter",
          "search_analyzer" : "standard"
        },
        "sortOrder" : {
          "type" : "long"
        },
        "title" : {
          "type" : "text",
          "norms" : false
        },
        "vppId" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        }
      }
    }
  }
}

My search:

GET product/_search
{
  "sort": { "_score": "desc" },
  "query": {
    "multi_match": {
      "query": "omep",
      "type": "bool_prefix",
      "fields": [
        "productName",
        "productName._2gram",
        "productName._3gram",
        "productName._4gram"
      ]
    }
  }
}

Can anyone offer any advice on this? I'm still struggling.

I'm not too familiar with this kind of use case (so take my advice with some caution), but I'm curious why your productName field uses the "type": "text" mapping rather than "type": "search_as_you_type" (see "Search-as-you-type field type" in the Elasticsearch Guide [8.3]).
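
For reference, here is a minimal sketch of what that mapping could look like. The index name product_v2 is hypothetical: as far as I know, an existing field's type can't be changed in place, so this assumes a new index and a reindex. I've set max_shingle_size to 4 only because your query references productName._4gram; with the default of 3 you would only get ._2gram and ._3gram.

```
# Sketch only: "product_v2" is a made-up index name; the other fields are omitted.
PUT product_v2
{
  "mappings": {
    "properties": {
      "productName": {
        "type": "search_as_you_type",
        "max_shingle_size": 4
      }
    }
  }
}
```

With this field type the ._2gram, ._3gram, ._4gram and ._index_prefix subfields are created automatically. On a plain text field they don't exist, so I'd expect those entries in your current query to contribute nothing.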

You could then also query "fields": ["productName._index_prefix"], which I would assume should help the results you're looking for score higher.
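
Here is a sketch of the matching query, following the pattern in the search-as-you-type docs. It assumes productName has been remapped as search_as_you_type in a hypothetical product_v2 index. I have not tested whether listing productName._index_prefix explicitly changes the scores. As I understand it, bool_prefix already treats the last term as a prefix and matches the terms in any order, so typing "20mg" first should still work, and "omep" should match the start of "Omeprazole" but not the middle of "Esomeprazole".

```
# Sketch only: assumes productName is a search_as_you_type field in "product_v2".
GET product_v2/_search
{
  "query": {
    "multi_match": {
      "query": "omep",
      "type": "bool_prefix",
      "fields": [
        "productName",
        "productName._2gram",
        "productName._3gram"
      ]
    }
  }
}
```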
