Hi,
What is the best method to avoid duplicate insertions (based on one or two fields) in an index? I am thinking of a mechanism like the primary key in a relational database.
Thanks for your suggestions.
Cristina
The best method is to avoid it in the first place (which is also the answer for relational databases).
What do you actually want to do with the "second" document that arrives: just bin it, overwrite the existing document that matches your two fields, raise an alarm, ...? Is there a time-series element (meant in the most general sense) to the documents?
There are workarounds to do what I think you are asking, but many don't (easily) scale.
Actually, the document with the duplicate key is exactly the same. So can I just overwrite it?
If your documents have a unique key, use it as a custom "_id" field, and Elasticsearch will overwrite the existing document.
But an overwrite is a little different: it first deletes the record and then inserts it, so it is two operations.
Also, in this case you can't use a data stream; you have to use a regular index.
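For anyone landing here later, a minimal sketch of that approach. The index name (`my-index`) and the field values below are made up for illustration; the idea is to build the `_id` from the one or two fields that must be unique:

```
# Index the document using the unique field as the _id.
# If a document with this _id already exists, it is overwritten
# (internally a delete followed by an insert).
PUT my-index/_doc/user@example.com
{
  "email": "user@example.com",
  "name": "Cristina"
}

# To reject duplicates instead of overwriting, use the _create endpoint:
# a second request with the same _id fails with a 409 conflict.
PUT my-index/_create/user@example.com
{
  "email": "user@example.com",
  "name": "Cristina"
}
```

If the key spans two fields, you can concatenate them (e.g. `fieldA + "-" + fieldB`) into the `_id` on the client side before indexing.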