Dec 11th, 2023: [EN] Relevant Search Combining ELSER and BM25 Text Queries

Kathleen_DeRusso · December 11, 2023, 8:00am

The Elastic Learned Sparse EncodeR (ELSER) lets you perform semantic search for more relevant search results. Sometimes, however, it’s more useful to combine semantic search results with regular keyword search results to get the best results possible. The question is: how do you combine text and semantic search results?

First, let’s look at a garden-variety text query, using multi_match over the title and description fields. This search has the typical pitfalls of keyword search: the keyword has to exist in some form in a document for that document to be returned, and the query doesn’t take the context of what users are searching for into account.

POST search-national-parks/_search
{
  "query": {
    "multi_match": {
      "query": "Where can I see the Northern Lights?",
      "fields": ["title", "description"]
    }
  },
  "_source": ["title"]
}

Now, let’s look at an ELSER query by itself:

POST search-national-parks/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "text_expansion": {
            "ml.inference.title_expanded.predicted_value": {
              "model_id": ".elser_model_2",
              "model_text": "Where can I see the Northern Lights?"
            }
          }
        },
        {
          "text_expansion": {
            "ml.inference.description_expanded.predicted_value": {
              "model_id": ".elser_model_2",
              "model_text": "Where can I see the Northern Lights?"
            }
          }
        }
      ]
    }
  },
  "_source": [
    "title"
  ]
}

The first way to combine these two queries is with a strategy known as linear boosting. In this example, the multi_match clause gets a boost of 4 while the ELSER clauses keep a boost of 1, so the text search results take precedence. This may or may not be desirable, depending on the query that you’re running.

POST search-national-parks/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "text_expansion": {
            "ml.inference.title_expanded.predicted_value": {
              "model_id": ".elser_model_2",
              "model_text": "Where can I see the Northern Lights?",
              "boost": 1
            }
          }
        },
        {
          "text_expansion": {
            "ml.inference.description_expanded.predicted_value": {
              "model_id": ".elser_model_2",
              "model_text": "Where can I see the Northern Lights?",
              "boost": 1
            }
          }
        },
        {
          "multi_match": {
            "query": "Where can I see the Northern Lights?",
            "fields": [
              "title",
              "description"
            ],
            "boost": 4
          }
        }
      ]
    }
  },
  "_source": [
    "title"
  ]
}
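To see why the boost values matter, here is a minimal Python sketch of how a bool query with should clauses combines scores: each matching clause's score is multiplied by its boost, and the results are summed per document. The document IDs, clause names, and raw scores below are made up for illustration, and the function is a simplified model, not Elasticsearch's actual scoring code.

```python
def combine_linear(clause_scores, boosts):
    """Sum each document's per-clause scores, weighted by the clause boost.

    clause_scores maps a clause name to {doc_id: raw_score}.
    boosts maps a clause name to its boost; missing clauses default to 1.0.
    """
    combined = {}
    for clause, doc_scores in clause_scores.items():
        weight = boosts.get(clause, 1.0)
        for doc_id, score in doc_scores.items():
            combined[doc_id] = combined.get(doc_id, 0.0) + weight * score
    # Highest combined score first
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Hypothetical raw scores, for illustration only
clause_scores = {
    "elser_title": {"park_a": 2.0, "park_b": 6.0},
    "elser_description": {"park_a": 3.0, "park_b": 5.0},
    "bm25": {"park_a": 4.0},  # park_b has no keyword match
}
boosts = {"elser_title": 1.0, "elser_description": 1.0, "bm25": 4.0}

ranked = combine_linear(clause_scores, boosts)
unboosted = combine_linear(clause_scores, {})
```

With these made-up numbers, the boost of 4 on the keyword clause moves park_a ahead of park_b, while the unboosted combination puts park_b first. Because keyword and ELSER scores are not on the same scale, the right boost values usually have to be found by experimenting with your own data.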

Finally, we can also use Reciprocal Rank Fusion (RRF) to combine text search results with semantic results. RRF re-ranks the combined results based on each document’s position in the individual result lists, rather than on its raw scores:

POST search-national-parks/_search
{
  "sub_searches": [
    {
      "query": {
        "multi_match": {
          "query": "Where can I see the Northern Lights?",
          "fields": [
            "title",
            "description"
          ]
        }
      }
    },
    {
      "query": {
        "text_expansion": {
          "ml.inference.title_expanded.predicted_value": {
            "model_id": ".elser_model_2",
            "model_text": "Where can I see the Northern Lights?"
          }
        }
      }
    },
    {
      "query": {
        "text_expansion": {
          "ml.inference.description_expanded.predicted_value": {
            "model_id": ".elser_model_2",
            "model_text": "Where can I see the Northern Lights?"
          }
        }
      }
    }
  ],
  "rank": {
    "rrf": {
      "window_size": 10,
      "rank_constant": 20
    }
  },
  "_source": [
    "title", "states"
  ]
}
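The arithmetic behind RRF is small enough to sketch in Python. Each document receives 1 / (rank_constant + rank) for every result list it appears in, with ranks starting at 1, and those contributions are summed. The function below is an illustrative model using the same window_size and rank_constant as the request above; the document IDs and rankings are invented, and treating window_size as a simple cutoff on each list is a simplifying assumption.

```python
def rrf(rankings, rank_constant=20, window_size=10):
    """Fuse ranked lists of document IDs with Reciprocal Rank Fusion.

    rankings is a list of result lists, each ordered best-first.
    Only the top window_size entries of each list contribute.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking[:window_size], start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (rank_constant + rank)
    # Highest fused score first
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical result lists, one per sub-search, for illustration only
rankings = [
    ["park_a", "park_b", "park_c"],  # multi_match
    ["park_b", "park_c"],            # ELSER on title
    ["park_b", "park_a"],            # ELSER on description
]

fused = rrf(rankings)
fused_ids = [doc_id for doc_id, _ in fused]
```

In this example park_b comes out on top because it ranks at or near the top of all three lists, even though it is not first in the keyword results. Since only ranks are used, the differing score scales of BM25 and ELSER don't need to be reconciled with boosts.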

These examples should help get you started on your journey to creating the most relevant search results for your use case!

Want to learn more, or just want to play around? Check out Search Labs for information and tutorials like this Search Tutorial to get started with building search solutions using vector search in Elasticsearch.

