Is it possible to use update_by_query with the bulk API?
I'm constantly polling data from MySQL and updating the relevant ES documents. However, since I don't match on the document IDs (I match on various keys of a document), I have been using the update_by_query function.
I'd like to bulk these requests since I'm making thousands of them, but it doesn't seem possible to use these two features together.
As far as I know, update-by-query does not work with the bulk API. In fact, it is essentially doing bulk updates under the hood: it executes a query to find all matching documents, collects their IDs, then issues a bulk request with an update action for each document.
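To make that two-phase flow concrete, here is a rough in-memory sketch. It is not Elasticsearch code and not how the server is actually implemented. The `index` dict, the helper names and the sample fields (`sku`, `warehouse`, `discontinued`) are all made up for illustration. It only shows the shape of "find matching IDs first, then apply one update action per ID".

```python
from typing import Callable, Dict, List

Doc = Dict[str, object]


def find_matching_ids(index: Dict[str, Doc], matches: Callable[[Doc], bool]) -> List[str]:
    """Phase 1: run the 'query' and collect the IDs of matching documents."""
    return [doc_id for doc_id, doc in index.items() if matches(doc)]


def to_update_actions(doc_ids: List[str], changes: Doc) -> List[dict]:
    """Phase 2a: turn each matching ID into one update action."""
    return [{"update": {"_id": doc_id}, "doc": dict(changes)} for doc_id in doc_ids]


def apply_actions(index: Dict[str, Doc], actions: List[dict]) -> None:
    """Phase 2b: apply the whole batch of update actions in one pass."""
    for action in actions:
        index[action["update"]["_id"]].update(action["doc"])


# Hypothetical stand-in for an index: document ID -> document source.
index = {
    "a1": {"sku": "X-1", "warehouse": "east", "stock": 3},
    "b2": {"sku": "X-1", "warehouse": "west", "stock": 5},
    "c3": {"sku": "Y-9", "warehouse": "east", "stock": 7},
}

ids = find_matching_ids(index, lambda doc: doc["sku"] == "X-1")
actions = to_update_actions(ids, {"discontinued": True})
apply_actions(index, actions)
```

Each update-by-query call pays for the search phase as well as the batch of updates, which is why issuing thousands of them adds up.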
A bulk update-by-query could be very expensive, since it would send off many search and bulk requests simultaneously.
Is there any way you can tie the document's ID in Elasticsearch to an ID in MySQL? Update-by-query is useful, but also rather expensive. Doing thousands of them sounds like it will put a lot of strain on your cluster.
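If the keys you match on can be combined into a stable identifier, one possible approach is to derive the document `_id` from them and send plain update actions through the bulk API, with no search phase at all. The sketch below only builds the newline-delimited JSON body that a `_bulk` request expects. The index name `customers`, the columns `customer_id` and `region`, and the helper names are assumptions for illustration. It also assumes the documents were indexed with the same derived ID in the first place, so existing data would need a reindex.

```python
import json
from typing import Dict, Iterable, List


def es_id_for(row: Dict[str, object], key_columns: List[str]) -> str:
    """Derive a deterministic document ID from the MySQL key columns.

    Assumes the key values never contain ':'; pick another separator otherwise.
    """
    return ":".join(str(row[column]) for column in key_columns)


def build_bulk_body(index_name: str, rows: Iterable[Dict[str, object]],
                    key_columns: List[str]) -> str:
    """Build an NDJSON bulk body with one update action per MySQL row."""
    lines = []
    for row in rows:
        # Action line: which document to update.
        action = {"update": {"_index": index_name, "_id": es_id_for(row, key_columns)}}
        # Source line: partial document; doc_as_upsert creates it if missing.
        source = {"doc": row, "doc_as_upsert": True}
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical rows as they might come back from a MySQL poll.
rows = [
    {"customer_id": 42, "region": "eu", "status": "active"},
    {"customer_id": 43, "region": "us", "status": "suspended"},
]

body = build_bulk_body("customers", rows, ["customer_id", "region"])
```

The resulting `body` can be posted to the `_bulk` endpoint with a content type of `application/x-ndjson`, so thousands of polled rows go out in a handful of requests instead of one update-by-query each.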