Is there a way to instruct the update API to 'upsert if modified'? From
what I can tell, a document will be updated whether or not anything has
changed.
Say I get a daily snapshot of users and their information from some
external source. Most days ~85% of the user data will not have changed,
but there will be some users updating their information, as well as new
users. I'd prefer sending the entire data set to ES as an 'upsert if
modified' operation, rather than having to get each document from the
index, check whether there is a difference, and then issue the update.
No, there is no such capability. However, there is something that may be of
interest. When you update/insert, you can pass a version along with it.
This version can be set to "come from an external source" in which case you
have complete control over the semantics of the version and how updates
work. In the case of external version semantics, ES will update a document
only if your supplied external version is higher than the current version
of the document. You can generate version numbers yourself, for example by
deriving them from a last_updated_timestamp in your external source, if
you have one. More details about this capability here:
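
As a rough illustration of the external-version idea in this reply, here is a minimal Python sketch. It assumes the source data carries an ISO-8601 timestamp (the thread's last_updated_timestamp), turns it into an integer version, and models the "apply only if the supplied version is higher" rule in plain Python. The helper names (external_version, index_params, would_apply) are made up for this example and are not part of any ES client; the only ES-specific detail used is that the index request accepts version and version_type=external parameters.

```python
from datetime import datetime, timezone


def external_version(last_updated: str) -> int:
    """Turn an ISO-8601 timestamp into epoch milliseconds for use as a version."""
    dt = datetime.fromisoformat(last_updated)
    if dt.tzinfo is None:
        # Assumption: timestamps without an offset are treated as UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    return int(dt.timestamp() * 1000)


def index_params(last_updated: str) -> dict:
    """Query parameters to send with the index request (hypothetical helper)."""
    return {
        "version": external_version(last_updated),
        "version_type": "external",
    }


def would_apply(stored_version, supplied_version: int) -> bool:
    """Model of the rule described above: write only if the supplied
    external version is higher than the stored one (or nothing is stored)."""
    return stored_version is None or supplied_version > stored_version


if __name__ == "__main__":
    old = external_version("2014-02-27T09:00:00+00:00")
    new = external_version("2014-02-28T09:00:00+00:00")
    print(index_params("2014-02-28T09:00:00+00:00"))
    print(would_apply(old, new))  # newer snapshot row: applied
    print(would_apply(new, new))  # unchanged row: skipped
```

With this scheme an unchanged user keeps the same timestamp, so resending the whole snapshot only rewrites the documents whose timestamp moved forward.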
Thanks. Unfortunately, we have no such field. This may be a useful feature
to include in ES. Since the update doc is already being merged with the
existing doc, an isModified check could easily be done during the merge.
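
To make the suggestion concrete, here is a small sketch of what "detect modification during the merge" could look like. It is plain Python over dicts and does not reflect how ES implements its merge internally; the function name merge_if_modified and the sample fields are invented for illustration.

```python
def merge_if_modified(existing, partial):
    """Merge `partial` into `existing` and report whether anything changed.

    Returns (merged_doc, modified). A missing existing doc counts as
    modified, which corresponds to the insert half of an upsert.
    """
    if existing is None:
        return dict(partial), True

    merged = dict(existing)
    modified = False
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Recurse into nested objects so only real differences count.
            merged[key], changed = merge_if_modified(merged[key], value)
            modified = modified or changed
        elif key not in merged or merged[key] != value:
            merged[key] = value
            modified = True
    return merged, modified


if __name__ == "__main__":
    stored = {"name": "Ann", "address": {"city": "Boston", "zip": "02110"}}

    # Same data as stored: nothing to write.
    print(merge_if_modified(stored, {"name": "Ann"}))

    # One nested field differs: the doc would need to be reindexed.
    print(merge_if_modified(stored, {"address": {"city": "Cambridge"}}))
```

A caller would skip the write whenever the second return value is False.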
On Friday, February 28, 2014 12:20:11 PM UTC-5, Andrew Mehler wrote: