Update By Query API

I certainly thought about adding such a thing when I built update_by_query, but I never ended up doing it. Originally I didn't implement it because I was trying to implement the smallest thing that worked. Now I don't think adding it would be a good choice. The "merge a partial document" behavior of update is from a time long before Elasticsearch had a fast and safe scripting language. I suppose "fast" isn't all that important here because the performance of update-by-query is dominated by indexing the documents, but "safe" is crucial. The idea of the "merge" behavior was that you didn't have to send big documents over the wire to Elasticsearch, and that is legit nice. A few years ago, when it wasn't safe to enable inline scripting by default, that was the only way to safely do that. But now that we have painless, we can write a script to make the change explicit.
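To make "write a script" concrete, here is a rough sketch of what an update-by-query request body with a painless script might look like. It only builds and prints the JSON; nothing is sent anywhere. The helper name `build_update_by_query_body`, the `status` and `user` fields, and the values are all made up for illustration, and the `source` key assumes a reasonably recent Elasticsearch version (older releases used `inline`).

```python
import json


def build_update_by_query_body(new_status, query):
    """Build a body for POST /<index>/_update_by_query (hypothetical helper).

    The change is spelled out in the script instead of being inferred
    from a partial document.
    """
    return {
        "query": query,
        "script": {
            "lang": "painless",
            # Explicit change: set one field from a script parameter.
            "source": "ctx._source.status = params.new_status",
            # Passing the value as a param keeps the script text constant.
            "params": {"new_status": new_status},
        },
    }


# Example field names and values are assumptions, not from a real index.
body = build_update_by_query_body("archived", {"term": {"user": "kimchy"}})
print(json.dumps(body, indent=2))
```

Only the small script and its params go over the wire, so this keeps the "don't send big documents" benefit that the merge behavior was meant to provide.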

So I admit I'm biased:

  1. I've worked on painless and I'm proud of it.
  2. The specifics of the "merge a partial document" behavior always confused me. Objects are merged but lists are replaced, at least that is how I remember it. I prefer to be explicit about this.
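For anyone who hasn't run into it, the merge rule as I described it above (objects merged, lists replaced) can be modeled in a few lines of Python. This is only a toy model of the behavior as remembered, not Elasticsearch's actual implementation, and `merge_partial` is a hypothetical name.

```python
def merge_partial(doc, partial):
    """Toy model: nested objects merge recursively, everything else is replaced."""
    merged = dict(doc)  # shallow copy so the input document is left untouched
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Object on both sides: merge field by field.
            merged[key] = merge_partial(merged[key], value)
        else:
            # Lists and scalars: the partial document's value wins outright.
            merged[key] = value
    return merged


doc = {"user": {"name": "kim", "age": 30}, "tags": ["a", "b"]}
partial = {"user": {"age": 31}, "tags": ["c"]}
result = merge_partial(doc, partial)
print(result)
```

Here `user.name` survives because the objects are merged, while `tags` ends up as just `["c"]`. That asymmetry is the part that is easy to get wrong, and a script has to state the intended outcome either way.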