Elasticsearch reindex with ELSER pipeline succeeds but only generates embeddings for a fraction of documents - no failures reported

Roland-02 · May 8, 2026, 3:12am

I'm reindexing documents with an ELSER inference pipeline to generate embeddings, but only a fraction of documents (usually around half) end up with embeddings despite the reindex completing successfully with no failures.

Setup:

Elasticsearch with ELSER model deployed and running
Source index: ~1000 documents
Destination index with rank_features mapping for embeddings
Pipeline: elser_double_embedding_pipeline

Reindex Request:

POST /_reindex
{
  "source": {
    "index": "v1214_test_staging_20260506_181845"
  },
  "dest": {
    "index": "v1214_test",
    "pipeline": "elser_double_embedding_pipeline"
  }
}

Task Status (Completed Successfully):

{
  "completed": true,
  "task": {
    "description": "reindex from [v1214_test_staging_20260506_181845] to [v1214_test]",
    "status": {
      "total": 977,
      "updated": 977,
      "created": 0,
      "deleted": 0,
      "batches": 2,
      "version_conflicts": 0,
      "noops": 0,
      "retries": {
        "bulk": 2,
        "search": 0
      }
    }
  },
  "response": {
    "total": 977,
    "updated": 977,
    "failures": []
  }
}

Problem: Despite successful completion, only 450 of the 977 documents have both embeddings; the count below shows 527 missing at least one:

GET /v1214_test/_count
{
  "query": {
    "bool": {
      "should": [
        {"bool": {"must_not": {"exists": {"field": "name_embedding"}}}},
        {"bool": {"must_not": {"exists": {"field": "content_embedding"}}}}
      ],
      "minimum_should_match": 1
    }
  }
}
// Returns: 527 documents missing embeddings
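To see which documents are affected rather than only how many, the same bool query can be run against _search instead of _count. The following is a minimal Python sketch that only builds the request body (it does not call Elasticsearch); the helper name and the size value are illustrative assumptions, while the field names are the pipeline's output fields from this thread.

```python
import json

# Field names come from the pipeline's output_field values in this thread.
EMBEDDING_FIELDS = ["name_embedding", "content_embedding"]


def missing_embeddings_search(fields, size=100):
    """Build a _search body matching documents that lack any of the given fields."""
    return {
        "size": size,
        "_source": False,  # only the _id of each hit is needed
        "query": {
            "bool": {
                "should": [
                    {"bool": {"must_not": {"exists": {"field": f}}}}
                    for f in fields
                ],
                "minimum_should_match": 1,
            }
        },
    }


body = missing_embeddings_search(EMBEDDING_FIELDS)
# Send as: GET /v1214_test/_search
print(json.dumps(body, indent=2))
```

Comparing the returned IDs across several reindex runs may show whether the same documents are skipped each time or a different subset on every run.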

Pipeline Configuration:

GET /_ingest/pipeline/elser_double_embedding_pipeline
{
  "elser_double_embedding_pipeline": {
....
    "processors": [
      {
        "inference": {
          "model_id": ".elser_model_2_linux-x86_64",
          "input_output": [
            {
              "input_field": "content",
              "output_field": "content_embedding"
            },
            {
              "input_field": "name",
              "output_field": "name_embedding"
            }
          ]
        }
      }
    ]
  }
}

What I've Tried:

Running update_by_query with pipeline - same result
Checking for failures in task response - none reported
Verifying source documents have name and content fields - they do
Testing pipeline simulation - works correctly
Running reindex multiple times - consistently ~46% success rate
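Since simulation works on the documents tested so far, one further check is to simulate the pipeline against the specific documents that ended up without embeddings. Below is a hedged Python sketch that builds a simulate request body; the sample values are hypothetical placeholders, and the helper name is an assumption for illustration.

```python
import json


def simulate_body(sources):
    """Wrap raw _source dicts in the shape the simulate pipeline API expects."""
    return {"docs": [{"_source": src} for src in sources]}


# Hypothetical sample values; substitute the _source of documents
# that came out of the reindex without embeddings.
sample_sources = [
    {"name": "example name", "content": "example content"},
]

body = simulate_body(sample_sources)
# Send as: POST /_ingest/pipeline/elser_double_embedding_pipeline/_simulate?verbose=true
print(json.dumps(body, indent=2))
```

If these documents simulate cleanly one at a time, that would point away from the document content and toward something that only happens under bulk load.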

Questions:

Why would reindex report success but not apply the pipeline to all documents?
How can I debug which documents are failing to get embeddings when no failures are reported?
Is there a way to force the pipeline to process all documents or identify which ones were skipped?

The task description doesn't show the pipeline name (I expected it to include [elser_double_embedding_pipeline]), which makes me suspect the pipeline isn't being applied, even though the API accepts the parameter without error.

Keith_Massey · May 8, 2026, 1:36pm

Have you checked the Elasticsearch logs on all nodes? I'm wondering whether something failed silently because the model was not available on a node where it was expected to be (just guessing). If there is nothing in the logs, it might be worth rerunning with debug-level logging for the org.elasticsearch.xpack.ml.action package.
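For reference, logger levels can be changed at runtime through the cluster settings API by prefixing the package name with logger. A minimal Python sketch of the two request bodies follows (one to enable debug logging, one to reset it); the helper name is an assumption, and the sketch only prints the bodies rather than sending them.

```python
import json

LOGGER_SETTING = "logger.org.elasticsearch.xpack.ml.action"


def logger_settings(level):
    """Build a cluster settings body; pass None to reset the logger to its default."""
    return {"persistent": {LOGGER_SETTING: level}}


enable_debug = logger_settings("DEBUG")
reset = logger_settings(None)  # serialized as null, which clears the setting
# Send each as: PUT /_cluster/settings
print(json.dumps(enable_debug, indent=2))
print(json.dumps(reset, indent=2))
```

Debug logging can be verbose, so it is probably worth sending the reset body once the reindex has been rerun.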

Keith_Massey · May 8, 2026, 2:02pm

Also, could you share your pipeline? Does your inference processor have ignore_failure set to true?
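If ignore_failure does turn out to be set, or if failures are otherwise being swallowed, one option is to give the inference processor an on_failure handler that writes the error message onto the document. The Python sketch below builds such a pipeline body from the processor posted above; the embedding_error field name and the helper are hypothetical, and this is only one possible way to surface the errors.

```python
import copy
import json

# Inference processor as posted earlier in the thread.
processor = {
    "inference": {
        "model_id": ".elser_model_2_linux-x86_64",
        "input_output": [
            {"input_field": "content", "output_field": "content_embedding"},
            {"input_field": "name", "output_field": "name_embedding"},
        ],
    }
}


def with_failure_capture(proc, error_field="embedding_error"):
    """Return a copy of the processor that records failures on the document."""
    out = copy.deepcopy(proc)
    (config,) = out.values()  # a processor dict has exactly one key: its type
    config.pop("ignore_failure", None)  # drop it so errors are not silently ignored
    config["on_failure"] = [
        # embedding_error is a hypothetical field name chosen for this sketch
        {"set": {"field": error_field, "value": "{{{ _ingest.on_failure_message }}}"}}
    ]
    return out


pipeline = {"processors": [with_failure_capture(processor)]}
# Send as: PUT /_ingest/pipeline/<pipeline name>
print(json.dumps(pipeline, indent=2))
```

After rerunning the reindex with a pipeline like this, an exists query on embedding_error should list the documents whose inference call failed, along with the reason.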
