Best Practice for Updating a Large Number of Documents

(Budi Irawan) #1

Hi guys,

I need suggestions and recommendations. I have documents with the body below, where "items" is a nested object.

  {
    "subtotal": 1000,
    "multiplier": 2,
    "total": 2000,
    "items": [
      {
        "subtotal": 100,
        "multiplier": 2,
        "total": 200
      },
      {
        "subtotal": 500,
        "multiplier": 2,
        "total": 1000
      }
    ]
  }

So, total = multiplier * subtotal. The multiplier is set by the user in my application. When the multiplier changes to another value, I have to update multiplier and total in all documents. I'm wondering how to achieve this with the best performance. I came up with the strategies below:

  1. Update_By_Query
    I have no idea how to update nested object fields using update_by_query. Is it also possible to update those fields in a native script plugin (Java)? (In my real case I'm using a native script.)

  2. Batch GET and Bulk UPDATE
    This is the last strategy I can think of. In my code, I would get all documents, loop over each document to update the multiplier and total, and send them back to Elasticsearch. I'm worried about the performance, especially when there are 100,000 documents.
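
For illustration, here is a rough sketch in plain Java of the per-document work that either strategy has to do: set the new multiplier, recompute total = multiplier * subtotal on the document, and do the same for every entry in "items". It assumes the document source is available as a Map (as it would be in client code or inside a script), and that subtotal and multiplier are whole numbers. The class name MultiplierUpdate, the method names, and the batch size are hypothetical, and the batching helper only shows how documents could be grouped so that each group becomes one bulk request; no Elasticsearch calls are made here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MultiplierUpdate {

    /** Sets the new multiplier and recomputes total = multiplier * subtotal on one object. */
    static void recompute(Map<String, Object> obj, long newMultiplier) {
        long subtotal = ((Number) obj.get("subtotal")).longValue();
        obj.put("multiplier", newMultiplier);
        obj.put("total", newMultiplier * subtotal);
    }

    /** Applies the new multiplier to the document itself and to every entry of "items". */
    @SuppressWarnings("unchecked")
    static void applyMultiplier(Map<String, Object> source, long newMultiplier) {
        recompute(source, newMultiplier);
        Object items = source.get("items");
        if (items instanceof List) {
            for (Object item : (List<Object>) items) {
                recompute((Map<String, Object>) item, newMultiplier);
            }
        }
    }

    /** Splits documents into fixed-size batches; each batch would become one bulk request. */
    static <T> List<List<T>> batches(List<T> docs, int batchSize) {
        List<List<T>> out = new ArrayList<>();
        for (int i = 0; i < docs.size(); i += batchSize) {
            int end = Math.min(i + batchSize, docs.size());
            out.add(new ArrayList<>(docs.subList(i, end)));
        }
        return out;
    }

    /** Small helper to build an object with subtotal, multiplier and total. */
    static Map<String, Object> obj(long subtotal, long multiplier) {
        Map<String, Object> m = new HashMap<>();
        m.put("subtotal", subtotal);
        m.put("multiplier", multiplier);
        m.put("total", subtotal * multiplier);
        return m;
    }

    public static void main(String[] args) {
        // Rebuild the example document from this post.
        Map<String, Object> source = obj(1000, 2);
        List<Object> items = new ArrayList<>();
        items.add(obj(100, 2));
        items.add(obj(500, 2));
        source.put("items", items);

        // The user changes the multiplier from 2 to 3.
        applyMultiplier(source, 3);
        System.out.println(source.get("total")); // prints 3000

        // Group five document ids into batches of two (the size is an arbitrary assumption).
        List<String> ids = List.of("1", "2", "3", "4", "5");
        System.out.println(batches(ids, 2).size()); // prints 3
    }
}
```

With update_by_query the same loop would live in the script and run on the nodes, while with the GET and bulk approach it would run in the application before each batch is sent back.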

Is there any other recommended strategy? What do you suggest?

Thank you :bow:

(Xavier Facq) #2


Update_By_Query with a Groovy script (on 2.4 in my case) works well.
Copy your script to all nodes, and use code like:

// Build an update-by-query request that runs a file-based Groovy script on each matching document
UpdateByQueryRequestBuilder contactRequestBuilder = UpdateByQueryAction.INSTANCE.newRequestBuilder(client);
BulkIndexByScrollResponse contactResponse = contactRequestBuilder
        .script(new Script("adddel_intlong_in_listfield", ScriptService.ScriptType.FILE, "groovy", scriptParams))
        .get(); // execute the request and wait for the response


(Budi Irawan) #3

Thanks for your answer

What did you do in the script? Did you add some columns there?

(Xavier Facq) #4


My script adds or removes a Long value from an array field. You can find it here:

