Use ELSER on data already in Elastic

Hey Carlos,

You can use an ingest pipeline to reindex data that already exists in Elastic. Many of the examples apply this pipeline at ingest time, but you can run the same pipeline at any later point and against different source datasets.
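
As a rough sketch of the pipeline route: the requests below assume ELSER v2 is deployed under the model ID .elser_model_2 and that your text lives in a field called content. The pipeline name, the index names, and both field names are placeholders I made up, so swap in your own.

# Hypothetical pipeline name; "content" and "content_embedding" are assumed field names
PUT _ingest/pipeline/elser-embeddings
{
  "processors": [
    {
      "inference": {
        "model_id": ".elser_model_2",
        "input_output": [
          {
            "input_field": "content",
            "output_field": "content_embedding"
          }
        ]
      }
    }
  ]
}

# Hypothetical index names; "pipeline" in "dest" runs every copied document through the pipeline
POST _reindex?wait_for_completion=false
{
  "source": {
    "index": "my-existing-index",
    "size": 50
  },
  "dest": {
    "index": "my-existing-index-elser",
    "pipeline": "elser-embeddings"
  }
}

With this approach the destination index should already exist with content_embedding mapped as sparse_vector, otherwise dynamic mapping will not store the tokens in a searchable form.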

Check out this tutorial, for example. You can skip the first step, which generates the starting data (treat that as the data you already have in Elastic), and go straight to the reindex command:

POST _reindex?wait_for_completion=false
{
  "source": {
    "index": "test-data",
    "size": 10 
  },
  "dest": {
    "index": "semantic-embeddings"
  }
}

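This copies the documents into the semantic-embeddings index and generates the embeddings for its semantic_text field along the way. The destination index needs to exist with that field mapped as semantic_text before you run the reindex; the name of the target field and the data structure are whatever you define in the index mapping.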
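
For reference, a destination mapping along these lines could be created before running the reindex. This is only a sketch: content is an assumed field name and my-elser-endpoint is a hypothetical inference endpoint that would have to exist already. Depending on your version you may be able to leave out inference_id and fall back to a default endpoint.

# "content" and "my-elser-endpoint" are placeholders, not values from your cluster
PUT semantic-embeddings
{
  "mappings": {
    "properties": {
      "content": {
        "type": "semantic_text",
        "inference_id": "my-elser-endpoint"
      }
    }
  }
}

The field name has to match the field that holds the text in your source index, since that is what gets copied over and embedded.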

Here is another notebook example

Or a similar question with some more relevant code examples