Hey Carlos,
You can use a pipeline to re-index data that already exists in Elasticsearch. Many examples apply the pipeline at ingest time, but you can run the same pipeline at any later time and on different source datasets.
Check out this tutorial, for example. You can skip the first step, which generates the starting data (treat it as the data you already have in Elasticsearch), and go straight to the reindex command:
POST _reindex?wait_for_completion=false
{
  "source": {
    "index": "test-data",
    "size": 10
  },
  "dest": {
    "index": "semantic-embeddings"
  }
}
This creates a new index that also contains the newly generated semantic_text field. You can also define the name of the target field, or the data structure, in the index mapping.
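
If you want to send the documents through an ingest pipeline during the reindex, here is a rough sketch of how the request body could be put together. It only builds and prints the request and does not send anything. The function name `build_reindex_request` and the pipeline name `my-embedding-pipeline` are made up for illustration, so swap in your own. The `pipeline` key under `dest` is the reindex option that applies an ingest pipeline.

```python
import json


def build_reindex_request(source_index, dest_index, batch_size=10, pipeline=None):
    """Return (path, body) for a _reindex call. Nothing is sent to the cluster."""
    dest = {"index": dest_index}
    if pipeline is not None:
        # "pipeline" under "dest" routes every reindexed document
        # through the named ingest pipeline.
        dest["pipeline"] = pipeline

    body = {
        # "size" here is the number of documents read per batch.
        "source": {"index": source_index, "size": batch_size},
        "dest": dest,
    }
    # wait_for_completion=false runs the reindex as a background task.
    path = "/_reindex?wait_for_completion=false"
    return path, body


# Hypothetical pipeline name, used only for illustration.
path, body = build_reindex_request(
    "test-data", "semantic-embeddings", pipeline="my-embedding-pipeline"
)
print("POST", path)
print(json.dumps(body, indent=2))
```

Since the request uses wait_for_completion=false, the response contains a task ID instead of the final result. You can check on progress with GET _tasks/<task_id>.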