Some beginner questions on optimizing search results

Hi there, hope you are well! I've been tasked with using the Relevance Tuning feature in App Search to improve the quality of our search results.

As some background:

  • I am very much non-technical!
  • We have an app search database of ~300,000 grocery store products.
  • The two main fields that we use to optimize search are "product title" and "product category."

I had two main questions I was hoping someone could help me with:

1.) Many of our user searches are for groups of foods like "cookies." We thought it would make sense to weight the "product category" field (weight of 10) higher than "product title" field (weight of 2).

Here's what's confusing to me. When we search for "cookies" in the Relevance Tuner, a product that has "cookies" in both the product title and product category receives the same score as a product that only has "cookies" in the product category. How can this be, given that "cookies" appears in two fields in one example and only one field in the other?

2.) If we do a test query like "rice," a result like this is heavily prioritized by the system:

Rice Chex Rice Cereal, Gluten Free, Rice

In other words, since the word "rice" is included three times in the product title, it's HEAVILY weighted by the App Search engine. This seems to come at the expense of even looking in the "product category" field. For example, even though we have a "product category" value called "Grains, Rice & Dried Goods," none of the top results are products from that category, since the system is heavily prioritizing the three "rice"s that appear in the product title. Is there any way to change this?

Thanks so much for any help - really do appreciate it!

Hi @xenawp314 ! Welcome to the Elastic community!

Question 1:

Relevance Tuning retrieves the final scores using weighted and boosted queries, depending on:

  • Weight for the different fields
  • Boosts for the field values

Besides that, the score is naturally affected by how significant a term is: how many times it appears in the document (term frequency), how many documents in the whole engine contain it (document frequency), and so on.

Scoring is complex and can sometimes be counterintuitive. We would need to look at your data to understand what is causing a specific score. From your description, though, the large gap between the weights (10 for product category versus 2 for product title) is probably letting the category score "shadow" whatever score the product title contributes. I'd suggest a more balanced set of weights; it will also benefit users who are looking for an exact product title rather than a category.
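To make the "shadowing" effect concrete, here is a toy model in Python. It is not App Search's actual scoring formula: the two combination rules (`best_field` and `sum_fields`) and the per-field match scores of 1.0 are assumptions chosen purely for illustration. Only the weights of 10 and 2 come from your question.

```python
# Toy model only: real scores come from the search engine and depend on your data.
WEIGHTS = {"product_category": 10, "product_title": 2}  # weights from the question


def weighted_scores(match_scores):
    """Multiply each field's (hypothetical) match score by its weight."""
    return {field: WEIGHTS[field] * score for field, score in match_scores.items()}


def best_field(match_scores):
    """Assumed rule A: the document score is its single best weighted field."""
    return max(weighted_scores(match_scores).values())


def sum_fields(match_scores):
    """Assumed rule B: the document score is the sum of all weighted fields."""
    return sum(weighted_scores(match_scores).values())


# "cookies" matches both fields vs. only the category field.
both = {"product_category": 1.0, "product_title": 1.0}
category_only = {"product_category": 1.0, "product_title": 0.0}

print(best_field(both), best_field(category_only))  # 10.0 10.0
print(sum_fields(both), sum_fields(category_only))  # 12.0 10.0
```

Under rule A the two products tie, which is the behaviour you described. Under rule B the title match adds only 2 points on top of 10. In both cases, narrowing the gap between the two weights gives the product title more influence on the ranking.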

The advantage of the Query Tester is that you can check in real time what the impact of weights and boosts is for the query.

As a suggestion, you could offer product category as a filter so that users can narrow down the search results they get, instead of relying on a direct text search against the product category.
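As a rough sketch of what that could look like, the Python snippet below only builds a search request body and prints it as JSON; it does not call any server. The field names `product_title` and `product_category`, the helper `build_search_body`, and the weights of 5 are hypothetical placeholders. The `search_fields`, `filters` and `facets` keys follow the App Search search API as I understand it, so please check the documentation for your version before relying on the exact shape.

```python
import json


def build_search_body(query, category=None):
    """Build a search request body with balanced weights and an optional category filter.

    Field names and weights are illustrative assumptions, not your real schema.
    """
    body = {
        "query": query,
        # Balanced weights instead of 10 vs. 2 (placeholder values).
        "search_fields": {
            "product_title": {"weight": 5},
            "product_category": {"weight": 5},
        },
        # Ask for category counts so the UI can show categories to drill into.
        "facets": {"product_category": [{"type": "value", "size": 10}]},
    }
    if category is not None:
        # Restrict results to one category chosen by the user.
        body["filters"] = {"product_category": [category]}
    return body


print(json.dumps(build_search_body("rice", "Grains, Rice & Dried Goods"), indent=2))
```

With this approach the category narrows the result set, while the text query is free to rank products inside that category by title.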

Search Relevance Tuning is part science, part art. It's difficult to fit all your use cases, so it's important to use App Search analytics to understand what customers are looking for and which results they expect, and to apply business knowledge to prioritise and curate the search results.

Question 2:

When one product repeats the search term several times in a field and the other products do not, that product is scored as more relevant. The score also depends on how frequently the term appears in that field across the other documents.
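For the curious, the effect of repetition can be sketched with the textbook BM25 term-frequency formula. Elasticsearch uses BM25 as its default similarity with `k1 = 1.2` and `b = 0.75`; whether your engine scores exactly this way is an assumption, and the field lengths and average length below are made-up numbers for illustration.

```python
def bm25_tf(tf, field_len, avg_field_len, k1=1.2, b=0.75):
    """Textbook BM25 term-frequency component (without the IDF factor).

    tf            -- how many times the term appears in the field
    field_len     -- number of words in this document's field
    avg_field_len -- average number of words in that field across documents
    """
    length_norm = 1 - b + b * field_len / avg_field_len
    return tf * (k1 + 1) / (tf + k1 * length_norm)


# "Rice Chex Rice Cereal, Gluten Free, Rice": 7 words, "rice" appears 3 times.
repeated = bm25_tf(tf=3, field_len=7, avg_field_len=5)
# A hypothetical 4-word title that mentions "rice" once.
single = bm25_tf(tf=1, field_len=4, avg_field_len=5)

print(f"repeated: {repeated:.3f}, single: {single:.3f}")
print("repeated title scores higher:", repeated > single)
```

The contribution grows with each repetition but levels off (it can never exceed `k1 + 1`), so the third "rice" helps less than the second. It is still enough to push a title like this above titles that mention the term once.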

I'd suggest manually curating those product titles so they don't contain as many repetitions. That description could perhaps be split across multiple fields:

  • Product Title: Rice Chex
  • Product Category: Rice Cereal
  • Allergies: Gluten Free
  • Ingredients: Rice

That way, you can provide a much richer search experience for the users (they can have multiple categories to search for, or drill down their search) as well as providing more accurate results.
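Here is a small Python sketch of that restructuring. The lowercase field names and the helper `term_counts` are hypothetical; the values come from the example above. It simply counts how many times "rice" appears in each field before and after the split.

```python
import re

# The product as it is today: everything packed into the title.
before = {"product_title": "Rice Chex Rice Cereal, Gluten Free, Rice"}

# The same product split across several (hypothetical) fields.
after = {
    "product_title": "Rice Chex",
    "product_category": "Rice Cereal",
    "allergies": "Gluten Free",
    "ingredients": "Rice",
}


def term_counts(doc, term):
    """Count whole-word, case-insensitive occurrences of term in each field."""
    pattern = re.compile(rf"\b{re.escape(term)}\b", flags=re.IGNORECASE)
    return {field: len(pattern.findall(value)) for field, value in doc.items()}


print(term_counts(before, "rice"))
# {'product_title': 3}
print(term_counts(after, "rice"))
# {'product_title': 1, 'product_category': 1, 'allergies': 0, 'ingredients': 1}
```

After the split, the title no longer gets a repetition bonus, and each of the new fields can be weighted, filtered or faceted on its own.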

Let us know about any other questions you might have!