I am trying to move away from scripted fields in our Elasticsearch cluster. I have updated our Logstash config files to add the relevant fields to new incoming documents, but I am having trouble figuring out how to update the existing documents. I set up an ingest pipeline that runs a Painless script to add the fields if they don't already exist in a document, and I am running an update_by_query in Kibana Dev Tools like this:
POST log-logname*/_update_by_query?pipeline=Add-hr-day-fields&conflicts=proceed
{
  "query": {
    "range": {
      "@timestamp": {
        "gte": "2022-02-26",
        "lte": "2022-02-28"
      }
    }
  }
}
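For context, a pipeline of the kind described above might look roughly like this. It is only a sketch, not the exact definition: the field names hour_of_day and day_of_week are placeholders, and it assumes @timestamp is stored as an ISO 8601 string that ZonedDateTime.parse can read.

```
PUT _ingest/pipeline/Add-hr-day-fields
{
  "description": "Sketch: add hour/day fields when missing (field names are placeholders)",
  "processors": [
    {
      "script": {
        "lang": "painless",
        "if": "ctx.hour_of_day == null",
        "source": """
          // Assumption: @timestamp is an ISO 8601 string such as 2022-02-26T13:45:00.000Z
          ZonedDateTime ts = ZonedDateTime.parse(ctx['@timestamp']);
          // hour_of_day and day_of_week are hypothetical field names
          ctx.hour_of_day = ts.getHour();
          ctx.day_of_week = ts.getDayOfWeek().getValue();
        """
      }
    }
  ]
}
```

The "if" condition means the script only runs for documents that do not already have the field, so documents that Logstash has already enriched pass through unchanged.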
This does add the fields to the documents, but the task seems to keep running after all the relevant documents have been updated. I am checking that with this command:
GET _tasks?detailed=true&actions=*byquery
I think it has something to do with version conflicts, since I see version conflict errors if I leave conflicts=proceed off the request.
So, is running an update by query the preferred way to do this, or should I do it another way? What causes the version conflicts and how can I avoid them?
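One variation I have considered, but not verified as a fix, is to restrict the query to documents that still lack the new field, so that a rerun does not touch documents that were already updated. A sketch, where hour_of_day is a placeholder for one of the new field names:

```
# hour_of_day is a placeholder for one of the new fields
POST log-logname*/_update_by_query?pipeline=Add-hr-day-fields&conflicts=proceed
{
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "@timestamp": {
              "gte": "2022-02-26",
              "lte": "2022-02-28"
            }
          }
        }
      ],
      "must_not": [
        { "exists": { "field": "hour_of_day" } }
      ]
    }
  }
}
```

This keeps the same date range as a filter and adds a must_not exists clause, so only documents without the field are selected.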
Thanks much.