How can I avoid indexing duplicate data into Elasticsearch, and how should I define my keys?


I am using a single index with the default document type, and every record in the index has the same 4 keys.

When I index records that already exist in Elasticsearch, duplicates are created because each record gets a new document id. I want Elasticsearch to insert a record only if no record with the same keys already exists.

How can I "tell" Elasticsearch not to index a record whose keys already exist in the index?
What is the best way to define my keys?


You can specify the document id yourself when you index the document, e.g. based on the keys you mentioned, or possibly the event type concatenated with the keys (if you want to store each type with the key). If you combine this with a create request, the request should fail if that document id already exists. If you instead send an index request, the existing document will get overwritten, but you will not have duplicates. If you want
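As a rough illustration of the idea of deriving the document id from the keys, here is a minimal Python sketch. The field names `key1` to `key4`, the index name, and the helper names `make_doc_id` and `create_request` are hypothetical, and hashing the key values is just one possible way to build the id (plain concatenation would also work if the values are short and safe to put in a URL). The sketch only builds the request; it assumes a recent Elasticsearch version where `PUT /<index>/_doc/<id>?op_type=create` is available, which should be rejected with a version conflict if the id already exists.

```python
import hashlib
import json

# Hypothetical names for the 4 key fields; replace with your own.
KEY_FIELDS = ("key1", "key2", "key3", "key4")


def make_doc_id(record, key_fields=KEY_FIELDS):
    """Build a deterministic document id from the key fields of a record.

    Records with the same key values always get the same id, regardless
    of their other fields. Assumes the key values are JSON-serializable.
    """
    key_values = [record[field] for field in key_fields]
    # Serialize as a JSON list so that ("ab", "c") and ("a", "bc") differ.
    payload = json.dumps(key_values, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


def create_request(index, record):
    """Return (method, path, body) for a create-only request.

    op_type=create asks Elasticsearch to reject the request if a
    document with this id already exists, instead of overwriting it.
    """
    doc_id = make_doc_id(record)
    path = f"/{index}/_doc/{doc_id}?op_type=create"
    return "PUT", path, json.dumps(record)


if __name__ == "__main__":
    record = {"key1": "a", "key2": "b", "key3": 1, "key4": 2, "value": 10}
    method, path, body = create_request("my-index", record)
    print(method, path)
    print(body)
```

Dropping `?op_type=create` from the path turns this into a plain index request, which overwrites the existing document with the same id rather than failing.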
