I am trying to move away from scripted fields in our Elasticsearch cluster. I have updated our Logstash config files to add the relevant fields to new incoming documents, but am having trouble figuring out how to update existing documents. I set up an ingest pipeline that runs a Painless script to add the fields if they don't exist in a document. I am running an update_by_query in Kibana dev tools like this:
This does add the fields to the documents, but the task seems to keep running after all the relevant documents have been updated. I am checking that with this command:
GET _tasks?detailed=true&actions=*byquery
I think it has something to do with version conflicts, since I see version conflict errors if I don't add conflicts=proceed to the command.
So, is running an update by query the preferred way to do this, or should I do it another way? What causes the version conflicts and how can I avoid them?
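For illustration, one way to keep an update by query from reprocessing documents that already have the new fields is to select only documents where a field is missing. The following is a minimal sketch that only builds the request and does not send it. The pipeline name `add-fields` and the field name `new_field` are hypothetical placeholders, not the names used in this cluster.

```python
import json
from urllib.parse import urlencode


def build_update_by_query(index, pipeline, missing_field):
    """Build the path and body for an _update_by_query request that only
    selects documents lacking `missing_field`."""
    # `pipeline` and `conflicts` are passed as URL parameters.
    params = urlencode({"pipeline": pipeline, "conflicts": "proceed"})
    path = f"/{index}/_update_by_query?{params}"
    # must_not + exists matches documents where the field is absent.
    body = {
        "query": {
            "bool": {
                "must_not": [{"exists": {"field": missing_field}}]
            }
        }
    }
    return path, body


# Hypothetical pipeline and field names, for illustration only.
path, body = build_update_by_query("log-logname", "add-fields", "new_field")
print("POST", path)
print(json.dumps(body, indent=2))
```

With a query like this, documents drop out of the result set once the pipeline has added the field, so a second run has less work to do.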
Thanks for the reply! Here is a representative error:
"failures" : [
  {
    "index" : "log-logname",
    "type" : "_doc",
    "id" : "rtgor16o001m01_Se0/0/0:1.64.1644254100",
    "cause" : {
      "type" : "version_conflict_engine_exception",
      "reason" : "[rtgor16o001m01_Se0/0/0:1.64.1644254100]: version conflict, required seqNo [5356283], primary term [3]. current document has seqNo [9338852] and primary term [5]",
      "index_uuid" : "XNeenpEfSrWEBSqhbnBohg",
      "shard" : "0",
      "index" : "log-logname"
    },
    "status" : 409
  },
I get many of these if I don't run the update by query with the conflicts=proceed option.
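To make the error easier to read: the reason string compares the sequence number and primary term that the update expected with the ones the document has now. A mismatch means the document was written again after the update by query read it. The sketch below pulls those numbers out of the reason text. The regular expression is an assumption based on the single message shown above, not a documented format.

```python
import re

# Pattern inferred from the one error message above; other versions of
# Elasticsearch may word the message differently.
CONFLICT_RE = re.compile(
    r"required seqNo \[(\d+)\], primary term \[(\d+)\]\. "
    r"current document has seqNo \[(\d+)\] and primary term \[(\d+)\]"
)


def parse_conflict(reason):
    """Return the expected and current (seqNo, primary term) pairs,
    or None if the reason string does not match the pattern."""
    match = CONFLICT_RE.search(reason)
    if match is None:
        return None
    req_seq, req_term, cur_seq, cur_term = map(int, match.groups())
    return {
        "required": (req_seq, req_term),
        "current": (cur_seq, cur_term),
        # A higher (term, seqNo) pair means a newer write has happened.
        "written_since_read": (cur_term, cur_seq) > (req_term, req_seq),
    }


reason = (
    "[rtgor16o001m01_Se0/0/0:1.64.1644254100]: version conflict, "
    "required seqNo [5356283], primary term [3]. current document has "
    "seqNo [9338852] and primary term [5]"
)
info = parse_conflict(reason)
print(info)
```

For the error above, the current pair is well ahead of the required one, which suggests that something else wrote the same document ID while the update was running.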
Also, I sometimes see an error about trying to create too many scroll contexts. Should I just raise the limit so that more can be created?
  "type" : "exception",
  "reason" : "Trying to create too many scroll contexts. Must be less than or equal to: [500]. This limit can be set by changing the [search.max_open_scroll_context] setting."
},
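The error names the setting that controls this limit. As a sketch of what raising it could look like, the code below builds a body for the cluster settings API without sending it. The value 1000 is an arbitrary example, not a recommendation, and a higher limit may only hide scroll contexts held open by update by query tasks that are still running.

```python
import json


def build_scroll_limit_settings(limit, persistent=True):
    """Build a cluster settings body that sets search.max_open_scroll_context."""
    # Persistent settings survive a full cluster restart; transient ones do not.
    scope = "persistent" if persistent else "transient"
    return {scope: {"search.max_open_scroll_context": limit}}


# 1000 is an arbitrary illustrative value.
settings_body = build_scroll_limit_settings(1000)
print("PUT /_cluster/settings")
print(json.dumps(settings_body, indent=2))
```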