I have an ES index containing several million records. I'm adding a field to the document mapping; afterwards I'll have to go back and calculate the value of this new field for every existing document. The value can differ per document, so I don't think the 'update by query' approach applies here, since I'll have to calculate the value outside of Elasticsearch.
Is there an efficient way of doing this?
Is the best way to do this to just pull the IDs of all documents in batches of 1000 or so, then for each batch calculate the value and update the documents using the bulk API?
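For concreteness, here's roughly the shape I have in mind using the Python client — just a sketch, untested; `my-index`, `new_field`, and `compute_value` are placeholders for my actual index, field, and external calculation:

```python
from elasticsearch import Elasticsearch, helpers

es = Elasticsearch("http://localhost:9200")  # assumption: local cluster
INDEX = "my-index"                           # placeholder index name

def compute_value(doc_id):
    # Stand-in for the real calculation, which happens outside ES.
    return 42

def update_actions():
    # helpers.scan streams every document via the scroll API;
    # _source=False fetches only IDs, keeping the scan cheap.
    for hit in helpers.scan(es, index=INDEX, _source=False):
        yield {
            "_op_type": "update",
            "_index": INDEX,
            "_id": hit["_id"],
            "doc": {"new_field": compute_value(hit["_id"])},
        }

# helpers.bulk groups the generator into bulk requests of chunk_size actions.
ok, errors = helpers.bulk(es, update_actions(), chunk_size=1000,
                          raise_on_error=False)
print(f"updated {ok} docs, {len(errors)} failures")
```

With `raise_on_error=False`, failed updates come back in `errors`, so a retry pass over just the failures seems possible. But is this scan-then-bulk pattern the right approach at this scale, or is there something more efficient?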