How to Use Custom Embeddings (Ollama) for Hybrid Search in Elasticsearch?

I'm trying to combine lexical search and semantic search using the RRF (Reciprocal Rank Fusion) algorithm, as described in the Elasticsearch documentation:

Since my data is in Indonesian, I need to use a custom embedding model. In my case, I'm using Ollama for generating embeddings. However, I haven't been able to find any tutorials on how to integrate a custom embedding model like this into Elasticsearch for hybrid search.

Could anyone point me in the right direction or share an example?

Hey Muhammad, welcome back!

I think there's a bit of a disconnect between the semantic_text field and what you want to accomplish. The semantic_text field you linked is a quick way to implement semantic search, but it is tied to ELSER as the embedding model. As far as I know, there is no way to plug in your own model. Combining lexical and semantic search with your own embedding model is still possible; it just takes a few extra steps:

  1. Generate embeddings using your preferred model.
  2. Index those embeddings into Elasticsearch alongside the original text (use the dense_vector field type for the embeddings).
  3. When you go to query, generate an embedding for your query using the same model.
  4. Run a knn search (This is effectively your vector/semantic search).
  5. Combine the knn results with the results of a standard match query using RRF.
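
As a rough sketch of the steps above (not a definitive implementation), the Python below builds the request bodies involved. It assumes Ollama is running locally on its default port, and that your Elasticsearch version supports the `rrf` retriever syntax (a recent 8.x release). The model name `bge-m3` and the field names `content` and `embedding` are placeholders I picked for illustration; swap in whatever works for Indonesian text in your setup.

```python
import json
import urllib.request

# Assumptions: local Ollama on its default port; placeholder model name.
OLLAMA_URL = "http://localhost:11434/api/embed"
MODEL = "bge-m3"


def embed(text, model=MODEL, url=OLLAMA_URL):
    """Steps 1 and 3: ask Ollama for an embedding of `text`."""
    payload = json.dumps({"model": model, "input": text}).encode("utf-8")
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        # /api/embed returns one vector per input under "embeddings"
        return json.load(resp)["embeddings"][0]


def index_mapping(dims):
    """Step 2: mapping with a text field and a dense_vector field.

    `dims` must match the length of the vectors your model produces.
    """
    return {
        "mappings": {
            "properties": {
                "content": {"type": "text"},
                "embedding": {
                    "type": "dense_vector",
                    "dims": dims,
                    "index": True,
                    "similarity": "cosine",
                },
            }
        }
    }


def hybrid_query(query_text, query_vector, k=10, num_candidates=50):
    """Steps 4 and 5: a match query and a knn search fused with RRF."""
    return {
        "retriever": {
            "rrf": {
                "retrievers": [
                    # Lexical side: plain match query on the text field
                    {"standard": {"query": {"match": {"content": query_text}}}},
                    # Semantic side: knn over the stored embeddings
                    {
                        "knn": {
                            "field": "embedding",
                            "query_vector": query_vector,
                            "k": k,
                            "num_candidates": num_candidates,
                        }
                    },
                ]
            }
        },
        "size": k,
    }
```

You would send `index_mapping(len(embed("test")))` as the body when creating the index, index each document as `{"content": text, "embedding": embed(text)}`, and send `hybrid_query(q, embed(q))` as the body of a `_search` request. The important part is that the same model produces both the document vectors and the query vector.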

Here are some resources to help you get started:

Good luck! Let me know if you get stuck.