Java Update Request (Upsert) pipeline not (supported) / working

ES version: 6.3.1

There is no setPipeline method on the UpdateRequest API, so I tried the code below, but sadly it doesn't work.
The existing document is updated, but without the pipeline transformations applied.

Could the experts please suggest a solution or alternatives?

    IndexRequest request = new IndexRequest(indexConfig.getIndexName(), indexConfig.getIndexType(), docId)
            .source(source);
    request.setPipeline(indexConfig.getPipeline());

    if (appConfig.isUpdateRequest()) {
        UpdateRequest upsertRequest = new UpdateRequest(indexConfig.getIndexName(), indexConfig.getIndexType(), docId)
                .doc(source)
                .upsert(request);
        bulkProcessor.add(upsertRequest);
    } else {
        bulkProcessor.add(request);
    }

It's not available. See the whole discussion at:

Very disappointed to hear that, to be honest.

We perform bulk indexing all the time. First we do an initial bulk indexing with pipelines. After that we do delta indexing, again in bulk mode, and here we need the same pipelines applied so that the resulting data in the documents is the same as after the initial indexing.

No support for pipelines on bulk update means I have to either call update by query with the pipeline after the index update, or remove the pipeline altogether and put the pipeline logic in the code, which is bad. :frowning:

What kind of processor are you using in the pipeline?

Mainly lowercase and replace processors. For an upcoming project, we will probably have to use a complex script processor in the pipeline.

Why use the update API then? Why not reindex the whole document?

IMHO the update API should be used in only 2 cases:

  • Huge documents, like megabytes of JSON
  • Usage of the attachment processor with an array or with big binary documents

Otherwise, I'd recommend using the index API.

Thanks for the reply David.

Here is what we do:

  1. Initial indexing (bulk / insert / IndexRequest) - We pull all entities to be indexed from an application REST endpoint.
  2. Delta indexing (bulk / update / UpdateRequest with docAsUpsert) - Here we pull all entities created or modified since a given point in time. In response, we may get nothing or even a million entities. In this scenario, we have to update documents if they already exist or create new ones if they don't.

So we have to apply the same pipelines in both routes. This way the field values stay intact.

What's happening now is:

  1. Initial indexing - field named 'status' is converted to lowercase via lowercase processor. The values are active & inactive.
  2. Delta indexing - as pipeline is not applied, the value in the status field changes to ACTIVE / INACTIVE / Active / Inactive etc.
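
One of the workarounds mentioned earlier in the thread is moving the pipeline logic into the indexing code. A minimal sketch of what that could look like for the `status` case, assuming a flat source map. The class and method names are hypothetical, and this is plain Java rather than anything from the Elasticsearch client:

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: applies a lowercase step client-side so that
// initial and delta indexing end up with the same field values.
public class StatusNormalizer {

    // Returns a copy of the source with the given field lowercased.
    // Missing or non-string values are left untouched.
    public static Map<String, Object> lowercaseField(Map<String, Object> source, String field) {
        Map<String, Object> copy = new HashMap<>(source);
        Object value = copy.get(field);
        if (value instanceof String) {
            copy.put(field, ((String) value).toLowerCase(Locale.ROOT));
        }
        return copy;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("status", "ACTIVE");
        System.out.println(lowercaseField(doc, "status").get("status")); // prints "active"
    }
}
```

The normalized map would then be used as the `doc` of the UpdateRequest. The downside is the one already noted: the logic lives in two places, the pipeline and the code.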

Yes, we can work around this, but I strongly feel you must consider supporting pipeline on Bulk UpdateRequest.

I strongly feel you must consider supporting pipeline on Bulk UpdateRequest.

Yeah. We did. But as written in this comment:

Discussed during fix it friday and this looks like a useful enhancement, but there are corner cases which would make it very tricky to support this. (index name or routing is changed during ingestion or when a node isn't allowed to run ingest) Therefor I'm closing this issue and we can re-evaluate this at a later time if this is still useful and the technical concerns can fixed easily.

A workaround could be to use update by query instead, as it supports ingest pipelines, at the price of being slower.

Another workaround would be to simulate that yourself by calling the _simulate ingest endpoint and then sending the result using the update API.

Another one is described here: https://github.com/elastic/elasticsearch/issues/17895#issuecomment-357661426

IndexRequest (bulk) is able to do insert / update. Caught me by surprise. :slight_smile:
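
For illustration, this is roughly what that looks like at the REST level: an `index` action in a `_bulk` body creates the document if it is missing and replaces it otherwise, and the pipeline can be given as the `pipeline` parameter on the bulk request. The sketch below only builds the NDJSON body. The class name, index name, type and id are hypothetical, and it handles only flat string fields with minimal escaping:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds the NDJSON body for a _bulk request made of
// "index" actions. Only flat string fields are handled to keep it short.
public class BulkIndexBody {

    // Escapes backslashes and double quotes for use inside a JSON string.
    // Control characters are not handled in this sketch.
    static String esc(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // One action line plus one source line, each terminated by a newline.
    public static String indexAction(String index, String type, String id, Map<String, String> source) {
        StringBuilder sb = new StringBuilder();
        sb.append("{\"index\":{\"_index\":\"").append(esc(index))
          .append("\",\"_type\":\"").append(esc(type))
          .append("\",\"_id\":\"").append(esc(id)).append("\"}}\n");
        sb.append('{');
        boolean first = true;
        for (Map.Entry<String, String> e : source.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            sb.append('"').append(esc(e.getKey())).append("\":\"")
              .append(esc(e.getValue())).append('"');
            first = false;
        }
        sb.append("}\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> source = new LinkedHashMap<>();
        source.put("status", "ACTIVE");
        // Would be sent as: POST /_bulk?pipeline=<pipeline id>
        System.out.print(indexAction("my-index", "doc", "1", source));
    }
}
```

In the Java client the equivalent is the `else` branch of the original snippet: an IndexRequest with setPipeline added to the bulk processor, used for both the initial and the delta run.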

That's what I tried to say at Java Update Request (Upsert) pipeline not (supported) / working

Thanks David. Sorry. I didn't read your post properly earlier as I was in full steam on that day. Thanks again for your quick reply. Much appreciated.
