I am trying to use the E5 model to generate embeddings for some documents.
I ran a reindex through an ingest pipeline (e5-test) to generate the embeddings:
POST _reindex?wait_for_completion=false
{
  "source": {
    "index": "source_index",
    "size": 50
  },
  "dest": {
    "index": "destination_index",
    "pipeline": "e5-test"
  }
}
After almost 8 minutes the task finished with this response, which reports a failure:
"response": {
  "took": 472742,
  "timed_out": false,
  "total": 1997,
  "updated": 1749,
  "created": 0,
  "deleted": 0,
  "batches": 35,
  "version_conflicts": 0,
  "noops": 0,
  "retries": {
    "bulk": 0,
    "search": 0
  },
  "throttled": "0s",
  "throttled_millis": 0,
  "requests_per_second": -1,
  "throttled_until": "0s",
  "throttled_until_millis": 0,
  "failures": [
    {
      "index": "destination_index",
      "id": "324723984",
      "cause": {
        "type": "array_index_out_of_bounds_exception",
        "reason": "Index 9 out of bounds for length 8"
      },
      "status": 500
    }
  ]
}
As far as I can see, none of the embeddings were generated, apparently because this single document failed.
Is there a way to make reindex continue when a single document fails during embedding generation?
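One thing I am considering, as an untested sketch rather than a confirmed fix: attaching an on_failure handler to the inference processor in the pipeline, so that a document that fails inference is still indexed with the error message recorded instead of failing the bulk request. The pipeline name e5-test comes from the request above. The model ID, the content and content_embedding field names, and the ingest_error field are assumptions for illustration and would need to match the real pipeline and mappings. I also have not verified that the handler catches this particular array_index_out_of_bounds_exception.

```console
# Sketch only: the model ID and field names below are assumptions, not my actual pipeline.
# on_failure runs only when the inference processor throws for a given document.
PUT _ingest/pipeline/e5-test
{
  "processors": [
    {
      "inference": {
        "model_id": ".multilingual-e5-small",
        "input_output": [
          {
            "input_field": "content",
            "output_field": "content_embedding"
          }
        ],
        "on_failure": [
          {
            "set": {
              "field": "ingest_error",
              "value": "{{ _ingest.on_failure_message }}"
            }
          }
        ]
      }
    }
  ]
}
```

If this works as hoped, documents that failed could be found afterwards by searching destination_index for documents where ingest_error exists.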