Use ELSER on data already in elastic

iulia · February 3, 2025, 11:53am

Hey Carlos,

You can use an ingest pipeline to re-index data that already exists in Elastic. Many of the examples apply this pipeline at ingest time, but you can run it at any other time and on different source datasets.
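For reference, here is a minimal sketch of what such a pipeline body could look like, built in Python so you can adapt it. It assumes an inference processor backed by an ELSER deployment; the model ID `.elser_model_2`, the field names `content` and `content_embedding`, and the pipeline name `elser-v2-test` are placeholders I picked for illustration, not something from your setup.

```python
import json


def build_elser_pipeline(input_field, output_field, model_id=".elser_model_2"):
    """Return the body of an ingest pipeline with one inference processor.

    The field names and the model ID are assumptions; use the ones that
    match your own index and your deployed model.
    """
    return {
        "processors": [
            {
                "inference": {
                    "model_id": model_id,
                    "input_output": [
                        {
                            "input_field": input_field,
                            "output_field": output_field,
                        }
                    ],
                }
            }
        ]
    }


# Body for a request such as: PUT _ingest/pipeline/elser-v2-test
# (the pipeline name is hypothetical).
pipeline_body = build_elser_pipeline("content", "content_embedding")
print(json.dumps(pipeline_body, indent=2))
```

The same pipeline can then be referenced whenever documents are written, whether at ingest time or during a later reindex.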

Check out this tutorial, for example. You can skip the first step, which generates the starting data (assume this is the data you already have in Elastic), and go directly to the reindex command:

POST _reindex?wait_for_completion=false
{
  "source": {
    "index": "test-data",
    "size": 10 
  },
  "dest": {
    "index": "semantic-embeddings"
  }
}

This copies the documents into a new index that also contains the newly generated semantic_text field. The name of the target field and the data structure are defined in the destination index mapping, so create that mapping before you reindex.
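If you would rather script this than run it from the Console, the request can be sketched in Python roughly as follows. It uses only the standard library. The helper names are my own, the optional `pipeline` argument maps to the `dest.pipeline` option of the reindex API (only needed when you go the ingest-pipeline route), and authentication is left out, so treat it as a starting point rather than a finished client.

```python
import json
import urllib.request


def build_reindex_body(source_index, dest_index, batch_size=10, pipeline=None):
    """Build the _reindex request body.

    batch_size is the number of documents per scroll batch (10 mirrors the
    request above). pipeline is optional and names an ingest pipeline to
    run on every document written to the destination index.
    """
    body = {
        "source": {"index": source_index, "size": batch_size},
        "dest": {"index": dest_index},
    }
    if pipeline is not None:
        body["dest"]["pipeline"] = pipeline
    return body


def start_reindex(base_url, body):
    """POST the body to _reindex and return the ID of the background task."""
    request = urllib.request.Request(
        f"{base_url}/_reindex?wait_for_completion=false",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["task"]


# Same body as the Console request above.
reindex_body = build_reindex_body("test-data", "semantic-embeddings")
print(json.dumps(reindex_body, indent=2))
```

Calling `start_reindex("http://localhost:9200", reindex_body)` (the URL is an assumption for a local, unsecured cluster) returns a task ID, which you can pass to `GET _tasks/<task_id>` to follow progress.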

Here is another notebook example

Or a similar question with some more relevant code examples
