Best Practice for Updating a Large Number of Documents

Hi guys,

I need suggestions and recommendations. I have documents with the body below, where items is a nested object.

{
  "subtotal": 1000,
  "multiplier": 2,
  "total": 2000,
  "items": [
    {
      "subtotal": 100,
      "multiplier": 2,
      "total": 200
    },
    {
      "subtotal": 500,
      "multiplier": 2,
      "total": 1000
    }
  ]
}

So, total = multiplier * subtotal. The multiplier is set by the user in my application. When the multiplier changes to another value, I have to update multiplier and total in all documents. I'm wondering how to achieve this with the best performance. I came up with the strategies below:

  1. Update_By_Query
    I have no idea how to update nested object fields using update_by_query. Is it also possible to update those fields in a native script plugin (Java)? (In my real case I'm using a native script.)

  2. Batch GET and Bulk UPDATE
    This is the last strategy I can think of. In my code, get all documents, loop over each document to update the multiplier and total, and send them back to Elasticsearch. I'm worried about the performance, especially when there are 100,000 documents.
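
For strategy 2, one way to keep each bulk request bounded is to split the document ids into fixed-size batches before sending them. Below is a minimal sketch in plain Java with no Elasticsearch calls. The class name BulkBatches, the method partition, and the batch size of 1,000 are placeholders chosen for illustration, not values recommended by Elasticsearch.

```java
import java.util.ArrayList;
import java.util.List;

public class BulkBatches {
    /** Splits items into consecutive batches of at most batchSize elements. */
    public static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the view so each batch is independent of the input list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        // Hypothetical ids standing in for 100,000 document ids.
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < 100_000; i++) {
            ids.add(i);
        }
        List<List<Integer>> batches = partition(ids, 1_000);
        System.out.println(batches.size() + " batches"); // prints "100 batches"
    }
}
```

Each batch would then become one bulk request. The right batch size depends on document size and cluster capacity, so it is worth measuring rather than assuming.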

Is there any other recommended strategy? What do you suggest?

Thank you :bow:

Hi,

Update_By_Query with a Groovy script (on 2.4 in my case) works well.
Copy your script to all nodes, and use code like this:

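// scriptParams (script parameters) and contactQuery (query selecting the documents) are defined elsewhere.
UpdateByQueryRequestBuilder contactRequestBuilder = UpdateByQueryAction.INSTANCE.newRequestBuilder(client);
BulkIndexByScrollResponse contactResponse = contactRequestBuilder
        .source(documentType.getIndexName())
        // ScriptType.FILE: the script is loaded from a file present on every node.
        .script(new Script("adddel_intlong_in_listfield", ScriptService.ScriptType.FILE, "groovy", scriptParams))
        .filter(contactQuery)
        .get();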
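
For the document in the question, the nested items appear in the document source as a list of objects, so the per-document logic can loop over that list. Here is a rough sketch of that logic in plain Java, operating on the source as a Map. It is not wired into a script plugin, and the names MultiplierUpdate, applyTo, recompute and entry are made up for illustration. It also assumes whole-number values, as in the example document.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MultiplierUpdate {
    /** Sets "multiplier" and recomputes "total" on one object (top level or item). */
    public static void applyTo(Map<String, Object> obj, long newMultiplier) {
        // Assumption: subtotal is a whole number, as in the example document.
        long subtotal = ((Number) obj.get("subtotal")).longValue();
        obj.put("multiplier", newMultiplier);
        obj.put("total", subtotal * newMultiplier);
    }

    /** Updates the top-level fields and every entry of the "items" list. */
    @SuppressWarnings("unchecked")
    public static void recompute(Map<String, Object> source, long newMultiplier) {
        applyTo(source, newMultiplier);
        Object items = source.get("items");
        if (items instanceof List) {
            for (Object item : (List<Object>) items) {
                applyTo((Map<String, Object>) item, newMultiplier);
            }
        }
    }

    /** Builds one object with subtotal, multiplier and total (demo helper). */
    public static Map<String, Object> entry(long subtotal, long multiplier) {
        Map<String, Object> m = new HashMap<>();
        m.put("subtotal", subtotal);
        m.put("multiplier", multiplier);
        m.put("total", subtotal * multiplier);
        return m;
    }

    public static void main(String[] args) {
        // Same shape as the document in the question.
        Map<String, Object> doc = entry(1000, 2);
        List<Object> items = new ArrayList<>();
        items.add(entry(100, 2));
        items.add(entry(500, 2));
        doc.put("items", items);

        recompute(doc, 3);
        System.out.println(doc.get("total")); // prints 3000
    }
}
```

The new multiplier would be passed in through the script parameters, and the same loop applies whether it runs inside a script or in client code before a bulk update.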

bye,
Xavier

@xavierfacq
Thanks for your answer

What did you do in the script? Did you add some columns there?

Hi,

My script adds or removes a Long value from an array field. You can find it here:

bye,
Xavier
