Hi,
What is the best method to avoid duplicate insertions (based on one or two fields) in an index? I am thinking of a mechanism like the primary key in a relational database.
Thanks for your suggestions.
Cristina
The best method is to avoid it in the first place (which is also the answer for relational databases).
What do you actually want to do with the "second" document that arrives: just bin it, overwrite the existing document that matches your two fields, raise an alarm, ...? Is there a time-series element (meant in the most general sense) to the documents?
There are workarounds to do what I think you are asking, but many don't (easily) scale.
Actually, the document with the duplicate key is exactly the same. So can I just overwrite it?
If your documents have a unique key, use it as a custom "_id" field, and Elasticsearch will overwrite the existing document.
But an overwrite is a little different: it first deletes the record and then inserts it, so it is two operations.
Also, in this case you can't use a data stream; you have to use a regular index.
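For anyone landing here later, a minimal sketch of that approach. The index name (`my-index`) and the field values below are made up for illustration; the idea is to build the `_id` from the one or two fields that must be unique:

```
# Index the document using the unique field as the _id.
# If a document with this _id already exists, it is overwritten
# (internally a delete followed by an insert).
PUT my-index/_doc/user@example.com
{
  "email": "user@example.com",
  "name": "Cristina"
}

# To reject duplicates instead of overwriting, use the _create endpoint:
# a second request with the same _id fails with a 409 conflict.
PUT my-index/_create/user@example.com
{
  "email": "user@example.com",
  "name": "Cristina"
}
```

If the key spans two fields, you can concatenate them (e.g. `fieldA + "-" + fieldB`) into the `_id` on the client side before indexing.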