Dec 25th, 2025: [EN] Eat something healthier at X-mas

I hope during the holidays you folks are also eating healthy stuff, not just sweet cakes :wink:

Let’s say you want to buy some fruit in advance. You may not know all the names, you may not know what kind of fruit you’d actually like to eat, the store may have too many things in its inventory, or (just like me) you’re spending the holidays abroad.

What could help here is a nice, low-effort, multilingual semantic search.

If you’re using Elastic Cloud Serverless, you can rely on many things that weren’t necessarily available one or two years ago, like semantic_text, EIS (Elastic Inference Service), or a multilingual dense vector model from Jina, which is enabled in EIS by default and doesn’t require your GPU to sweat or you to pre-plan ML nodes.
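
If you’d like to double-check that the endpoint is really there before relying on it, the inference API can be asked for it by ID. This is just a quick sketch: it assumes the endpoint ID is .jina-embeddings-v3 (the one used in the mapping in this post), and the ID or the response details may differ in your project.

```console
// Look up the preconfigured inference endpoint by its ID
// (assumption: the ID matches the inference_id used in the mapping)
GET _inference/.jina-embeddings-v3
```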

Let’s say the index in which the store keeps its inventory is really, really simple (we skip the name, SKUs and other stuff just for the sake of simplicity).

PUT inventory
{
  "mappings": {
    "properties": {
      "item": {
        "type": "semantic_text",
        "inference_id": ".jina-embeddings-v3"
      }
    }
  }
}

Then, let’s seed it with some items that can be purchased:

POST inventory/_bulk?refresh=true
{ "index": { } }
{ "item": "cherries 🍒" }
{ "index": { } }
{ "item": "train 🚆" }
{ "index": { } }
{ "item": "bananas 🍌" }
{ "index": { } }
{ "item": "computer 💻" }
{ "index": { } }
{ "item": "apple 🍎" }
{ "index": { } }
{ "item": "framboises 🍓" }
{ "index": { } }
{ "item": "der Apfel 🍏" }
{ "index": { } }
{ "item": "tomato 🍅" }
{ "index": { } }
{ "item": "das Auto 🚗" }
{ "index": { } }
{ "item": "bicycle 🚲" }
{ "index": { } }
{ "item": "naranjas 🍊" }

Please note that the inventory holds stuff from all departments, and that it’s in English, French, German and Spanish.

If we run POST inventory/_search, the items come back in no particular order (only the first 10 hits are returned by default, while hits.total reports all 11).
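
A search without a body behaves like a match_all query and returns at most 10 hits by default, so with 11 items one would be left out. A small sketch that lists the whole inventory could look like this; the size of 20 is an arbitrary value, picked only because it’s larger than the number of items:

```console
// List everything in the index; every hit gets the same score,
// so the order carries no meaning
POST inventory/_search
{
  "size": 20, // arbitrary, just needs to be >= the 11 items we indexed
  "query": {
    "match_all": {}
  }
}
```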

However, when I’d like to eat some fruit, which in Polish is “owoce” (that’s plural BTW), then all I need is:

POST inventory/_search
{
  "query": {
    "match": {
      "item": "owoce" // this stands for "fruit" in Polish
    }
  }
}

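What we get in return is the following (only the top 10 of the 11 matching documents are listed, as 10 is the default page size; the train 🚆 is the one that didn’t make the cut):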

{
  "took": 251,
  "timed_out": false,
  "_shards": {
    "total": 3,
    "successful": 3,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 11,
      "relation": "eq"
    },
    "max_score": 0.6704586,
    "hits": [
      {
        "_index": "inventory",
        "_id": "8EtNK5sBRerpcHC7zVrq",
        "_score": 0.6704586,
        "_source": {
          "item": "cherries 🍒"
        }
      },
      {
        "_index": "inventory",
        "_id": "9EtNK5sBRerpcHC7zVrr",
        "_score": 0.6327668,
        "_source": {
          "item": "apple 🍎"
        }
      },
      {
        "_index": "inventory",
        "_id": "-ktNK5sBRerpcHC7zVrr",
        "_score": 0.61157316,
        "_source": {
          "item": "naranjas 🍊"
        }
      },
      {
        "_index": "inventory",
        "_id": "8ktNK5sBRerpcHC7zVrr",
        "_score": 0.6047706,
        "_source": {
          "item": "bananas 🍌"
        }
      },
      {
        "_index": "inventory",
        "_id": "9UtNK5sBRerpcHC7zVrr",
        "_score": 0.60331476,
        "_source": {
          "item": "framboises 🍓"
        }
      },
      {
        "_index": "inventory",
        "_id": "9ktNK5sBRerpcHC7zVrr",
        "_score": 0.5917518,
        "_source": {
          "item": "der Apfel 🍏"
        }
      },
      {
        "_index": "inventory",
        "_id": "90tNK5sBRerpcHC7zVrr",
        "_score": 0.5634274,
        "_source": {
          "item": "tomato 🍅"
        }
      },
      {
        "_index": "inventory",
        "_id": "-UtNK5sBRerpcHC7zVrr",
        "_score": 0.50522983,
        "_source": {
          "item": "bicycle 🚲"
        }
      },
      {
        "_index": "inventory",
        "_id": "80tNK5sBRerpcHC7zVrr",
        "_score": 0.5001138,
        "_source": {
          "item": "computer 💻"
        }
      },
      {
        "_index": "inventory",
        "_id": "-EtNK5sBRerpcHC7zVrr",
        "_score": 0.48864484,
        "_source": {
          "item": "das Auto 🚗"
        }
      }
    ]
  }
}

This tells us a few things:

  • Creating and running semantic search is waaaaay easier these days, compared to what it was a few years and versions ago; combining semantic_text and models running in EIS makes things really easy: no need to install the model, no need to worry about planning capacity, no multiple network roundtrips to get the embeddings (both to store and to search), and so on.
  • A multilingual model helps a lot and saves you the translation effort.
  • We know that tomato :tomato: is a fruit, but maybe we shouldn't add it to a fruit salad :wink:
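
One more thing you may have noticed: the bicycle, the computer and the car are still in the results, just with lower scores. If you’d rather not see them at all, one option is a score cutoff with min_score. Here’s a minimal sketch; the 0.55 threshold is an assumption picked by eye from the response above (it sits between the tomato at about 0.56 and the bicycle at about 0.51), and it would need re-tuning for other data, queries or models:

```console
// Same query as before, but drop hits below an illustrative score threshold
POST inventory/_search
{
  "size": 20,
  "min_score": 0.55, // assumption: eyeballed from the scores in this post
  "query": {
    "match": {
      "item": "owoce"
    }
  }
}
```

With the scores shown above, that would keep everything from the cherries down to the tomato, so the fruit salad question stays open.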

That's all for today. I wish you a healthy diet and healthy clusters :slight_smile:
