Hello everyone,
I am using the Python helpers to update an Elasticsearch index. If a document already exists, the update script appends to an array of JSON objects. It takes about 1 hour to update 10 million records. Are upserts in ES generally that slow, or am I doing something wrong? I would appreciate any insights on this. Thanks!
Here is a snippet of the Python script I am using:
action = {
    "_op_type": 'update',
    "_id": ,
    "_routing": ,
    "upsert": {
        "id": ,
        "val1": ,
        "val2":   ## nested objects in the schema
    },
    "script": "my-script",
    "params": {
        "new_val1": ,
        "new_val2":
    }
}
actions.append(action)
if len(actions) == 1000:
    response = helpers.streaming_bulk(es, actions, index='index',
                                      doc_type='type', chunk_size=1000, request_timeout=400)
The script my-script is: ctx._source.val2 += new_val2; ctx._source.val1 += new_val1
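
For reference, here is a minimal sketch of the same loop with the actions produced by a generator instead of a list filled by hand. It rests on assumptions: `records` is an iterable of dicts, and the keys `id`, `routing`, `val1` and `val2` are hypothetical stand-ins for the values left blank in the snippet above. Using the same value for `upsert` and `params` is only illustrative. Note that `helpers.streaming_bulk` returns a lazy generator, so no request is sent until it is iterated.

```python
def build_actions(records):
    """Yield one bulk 'update' action per record, shaped like the dict above.

    The record keys used here ("id", "routing", "val1", "val2") are
    hypothetical; replace them with whatever the real input provides.
    """
    for rec in records:
        yield {
            "_op_type": "update",
            "_id": rec["id"],
            "_routing": rec["routing"],
            # Document to index when the id does not exist yet.
            "upsert": {"id": rec["id"], "val1": rec["val1"], "val2": rec["val2"]},
            # Script applied when the document already exists.
            "script": "my-script",
            "params": {"new_val1": rec["val1"], "new_val2": rec["val2"]},
        }


def run_bulk(es, records):
    """Iterate over streaming_bulk so the requests are sent; return the failure count."""
    # Imported here so build_actions can be used without the client installed.
    from elasticsearch import helpers

    failed = 0
    for ok, item in helpers.streaming_bulk(
        es, build_actions(records), index="index", doc_type="type",
        chunk_size=1000, request_timeout=400, raise_on_error=False,
    ):
        if not ok:
            failed += 1
    return failed
```

With this shape, `streaming_bulk` does the batching itself through `chunk_size`, so the manual `len(actions) == 1000` check is not needed.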