I am using a single index with one default document type, and every record in my index has the same 4 keys.
When I index records that already exist in Elasticsearch, duplicate data is created because each record gets a new document ID. I want Elasticsearch to insert a record only if no record with the same keys already exists.
How can I "tell" Elasticsearch not to index a record whose keys already exist in the index?
What is the best way to define my keys?
You can specify the document ID yourself when you index the document, e.g. based on the keys you mentioned, or possibly the event type concatenated with the keys (if you want to store each type with the key). If you combine this with a create request, the request will fail if that document ID already exists. If you instead send an index request, the existing document will be overwritten, but you will not get duplicates. If you want
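
As a rough illustration of the idea above, here is a minimal Python sketch that derives a deterministic document ID from the four keys and builds a bulk action that uses the create operation. The field names (`key1` to `key4`), the index name `my-index`, and the helper names `make_doc_id` and `to_create_action` are all hypothetical, and hashing the keys is just one possible way to build the ID; concatenating them directly would work as well.

```python
import hashlib
import json

# Hypothetical names of the four key fields; replace with your own.
KEY_FIELDS = ("key1", "key2", "key3", "key4")


def make_doc_id(record, key_fields=KEY_FIELDS):
    """Build a deterministic document ID from the record's key fields."""
    key_values = [str(record[field]) for field in key_fields]
    # JSON-encode the list so that ("a|b", "c") and ("a", "b|c") cannot collide.
    canonical = json.dumps(key_values, separators=(",", ":"))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


def to_create_action(record, index="my-index"):
    """Wrap a record as a bulk action that only succeeds for a new ID."""
    return {
        "_op_type": "create",  # use "index" instead to overwrite existing docs
        "_index": index,       # hypothetical index name
        "_id": make_doc_id(record),
        "_source": record,
    }
```

Actions shaped like this can be passed to the `helpers.bulk` function of the official Python client. With the create operation, a record whose ID already exists should be rejected with a version conflict (HTTP 409) instead of being stored a second time, so the calling code needs to treat those conflicts as expected rather than as failures.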