Bug in bulk requests that mix items with and without ingest pipelines

I'm new here but not new to using the Elastic stack; I apologise if I get reporting this bug wrong.

This bug exists in Elasticsearch 7.8. I know it doesn't exist in 6.8 but I haven't looked at all the in-between versions.

If I put together a bulk request that mixes index requests, some of which call an ingest pipeline and some of which don't, a problem appears when either the pipeline fails or the pipeline contains a drop processor. The request handling uses the wrong "slot" number in the bulk request / response: the "executeBulkRequest" method of "IngestService" counts only the requests that have a pipeline when working out the "slot" for the callback. As a result, the wrong items are marked as failed or dropped. I haven't tested the failure case much, so that needs verifying, but here are the steps to reproduce the problem with the drop processor.
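To make the suspected miscount concrete, here is a minimal Python sketch. It is only a toy model of the behaviour described above, not the actual Elasticsearch code: the function names are hypothetical, and a pipeline named "mypipeline" simply stands in for "a pipeline that drops the document".

```python
from typing import List, Optional


def dropped_slots_buggy(pipelines: List[Optional[str]]) -> List[int]:
    """Toy model of the suspected bug: the slot counter advances only for
    requests that have a pipeline, so it drifts away from the real position."""
    dropped = []
    slot = 0  # counts pipeline requests only (the suspected mistake)
    for pipeline in pipelines:
        if pipeline is None:
            continue  # request without a pipeline: counter is not advanced
        if pipeline == "mypipeline":  # stands in for "the pipeline drops the doc"
            dropped.append(slot)
        slot += 1
    return dropped


def dropped_slots_fixed(pipelines: List[Optional[str]]) -> List[int]:
    """Same loop, but the slot is the item's real position in the bulk request."""
    dropped = []
    for slot, pipeline in enumerate(pipelines):
        if pipeline == "mypipeline":
            dropped.append(slot)
    return dropped


# One request without a pipeline, followed by one that is dropped.
bulk = [None, "mypipeline"]
print(dropped_slots_buggy(bulk))  # [0]  (the first item is wrongly marked)
print(dropped_slots_fixed(bulk))  # [1]  (the second item is marked)
```

In this model the buggy counter reports slot 0 for a bulk request whose dropped document actually sits at slot 1, which matches the symptom in the reproduction that follows.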

First, create a pipeline that drops every document:

curl -s -XPUT -H Content-type:application/json localhost:9200/_ingest/pipeline/mypipeline -d '{ "processors": [ { "drop": { } } ] }'

My bulk request (saved in a file named bulk_request)...

{"index":{"_index":"myindex","_type":"_doc","_id":"1"}}
{"test":"1"}
{"index":{"_index":"myindex","_type":"_doc","_id":"2","pipeline":"mypipeline"}}
{"test":"2"}

Sent with...

curl -s -XPOST -H Content-type:application/json localhost:9200/_bulk --data-binary @bulk_request

The response...

{
  "took": 242,
  "ingest_took": 11,
  "errors": false,
  "items": [
    {
      "index": {
        "_index": "myindex",
        "_type": "_doc",
        "_id": "1",
        "_version": -3,
        "result": "noop",
        "_shards": {
          "total": 0,
          "successful": 0,
          "failed": 0
        },
        "status": 200
      }
    },
    {
      "index": {
        "_index": "myindex",
        "_type": "_doc",
        "_id": "2",
        "_version": 1,
        "result": "created",
        "_shards": {
          "total": 2,
          "successful": 1,
          "failed": 0
        },
        "_seq_no": 0,
        "_primary_term": 1,
        "status": 201
      }
    }
  ]
}
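The mismatch in this response can also be spotted mechanically. The sketch below is only an illustration: the helper name is hypothetical, and it assumes the bulk body contains only index actions and that the index has no default pipeline configured. It flags any item that had no pipeline in the request but still came back as "noop".

```python
import json
from typing import Dict, List


def suspicious_items(bulk_body: str, response: Dict) -> List[str]:
    """Return the _id of every index action that had no pipeline in the
    request but was still reported as "noop" in the bulk response."""
    lines = [json.loads(line) for line in bulk_body.splitlines() if line.strip()]
    actions = lines[0::2]  # assumes index actions only: action line, then source line
    flagged = []
    for action, item in zip(actions, response["items"]):
        meta = action["index"]
        result = item["index"]["result"]
        if "pipeline" not in meta and result == "noop":
            flagged.append(meta["_id"])
    return flagged


# The request and a trimmed copy of the response from this post.
bulk_body = "\n".join([
    '{"index":{"_index":"myindex","_type":"_doc","_id":"1"}}',
    '{"test":"1"}',
    '{"index":{"_index":"myindex","_type":"_doc","_id":"2","pipeline":"mypipeline"}}',
    '{"test":"2"}',
])
response = {
    "errors": False,
    "items": [
        {"index": {"_id": "1", "result": "noop", "status": 200}},
        {"index": {"_id": "2", "result": "created", "status": 201}},
    ],
}
print(suspicious_items(bulk_body, response))  # ['1']
```

Document 1 never went through a pipeline, so nothing should have been able to drop it, yet it is the one reported as "noop".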

And if you search the index, it contains the wrong document: document 2, which the pipeline should have dropped, was indexed, while document 1 is missing...

{
  "took": 55,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "myindex",
        "_type": "_doc",
        "_id": "2",
        "_score": 1,
        "_source": {
          "test": "2"
        }
      }
    ]
  }
}

I believe the fix should be relatively simple in IngestService but maybe I'm missing something.

Welcome to our community! :smiley:

If you think this is a bug, please create a new issue in https://github.com/elastic/elasticsearch/issues/.

Thanks Mark.

I've opened a bug at https://github.com/elastic/elasticsearch/issues/60437.
