How to identify duplicates and delete them in an index

Naga_Prudhvi · June 22, 2022, 5:13am

We are using Elasticsearch 7.11.1. We recently observed an issue; the details are below.

We store data at 15-minute intervals, and the timestamp comes from our input file (e.g. 05:00, 23:15, 20:30, 11:45).
Recently we noticed that our input file for 23:15 has 1890 records, but the index has 3533 records.
We now want to delete the 1643 duplicate records from the index without disturbing the remaining 1890 records.

We need an API query for that.

for example

input file

name product sale id
sai pen 100 1
kumar car 30 2
sai pen 100 1
sai pen 100 1
ram bike 288 3
kumar car 30 2

After deleting duplicates, my index should look like this:

name product sale id
sai pen 100 1
ram bike 288 3
kumar car 30 2
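
To make the expected outcome concrete, here is a minimal Python sketch of the "keep one copy per record" logic applied to the example rows. It does not talk to Elasticsearch; the document IDs (`a1` ... `a6`) and the function name `split_duplicates` are made up for illustration, and it assumes all four columns together identify a record.

```python
def split_duplicates(docs, key_fields):
    """Return (keep_ids, delete_ids).

    The first document seen for each combination of key_fields is kept;
    every later document with the same combination is marked for deletion.
    """
    seen = set()
    keep_ids, delete_ids = [], []
    for doc_id, source in docs:
        key = tuple(source[field] for field in key_fields)
        if key in seen:
            delete_ids.append(doc_id)
        else:
            seen.add(key)
            keep_ids.append(doc_id)
    return keep_ids, delete_ids


# Hypothetical (_id, _source) pairs mirroring the example input file.
docs = [
    ("a1", {"name": "sai", "product": "pen", "sale": 100, "id": 1}),
    ("a2", {"name": "kumar", "product": "car", "sale": 30, "id": 2}),
    ("a3", {"name": "sai", "product": "pen", "sale": 100, "id": 1}),
    ("a4", {"name": "sai", "product": "pen", "sale": 100, "id": 1}),
    ("a5", {"name": "ram", "product": "bike", "sale": 288, "id": 3}),
    ("a6", {"name": "kumar", "product": "car", "sale": 30, "id": 2}),
]

keep, delete = split_duplicates(docs, ["name", "product", "sale", "id"])
print(keep)    # ['a1', 'a2', 'a5']
print(delete)  # ['a3', 'a4', 'a6']
```

Three documents remain (one each for sai, kumar and ram) and three surplus copies are marked for deletion, which matches the desired index contents above.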

Naga_Prudhvi · June 22, 2022, 5:17am

I need help with

query to find only duplicates at 23:15
query to delete duplicates
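
One possible way to find the duplicates for a single interval is a `terms` aggregation with `min_doc_count: 2`, restricted by a query on the timestamp, with a `top_hits` sub-aggregation to list the document IDs in each group. The sketch below only builds the request body for `POST /<index>/_search`. The field names (`timestamp`, `id`), the timestamp value format and the aggregation names are assumptions, since the actual mapping is not shown in this thread, and it assumes a single field is enough to identify a record.

```python
import json


def duplicate_search_body(timestamp_field, timestamp_value, key_field,
                          max_groups=2000, max_copies=100):
    """Build a search body that lists groups of documents sharing key_field.

    Only groups with at least two documents (i.e. duplicates) are returned.
    All field names passed in are assumptions about the index mapping.
    """
    return {
        "size": 0,  # only the aggregation result is needed, not search hits
        "query": {"term": {timestamp_field: timestamp_value}},
        "aggs": {
            "duplicate_groups": {
                "terms": {
                    "field": key_field,
                    "min_doc_count": 2,   # skip values that occur only once
                    "size": max_groups,   # maximum number of groups returned
                },
                "aggs": {
                    # IDs of the documents in each duplicate group
                    "copies": {"top_hits": {"size": max_copies, "_source": False}}
                },
            }
        },
    }


# "2022-06-21T23:15:00" is a placeholder; use whatever format the index stores.
body = duplicate_search_body("timestamp", "2022-06-21T23:15:00", "id")
print(json.dumps(body, indent=2))
```

If the key field is mapped as `text`, the aggregation would probably need its `keyword` sub-field instead. Groups with more copies than `max_copies` would need more than one pass.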

RabBit_BR · June 22, 2022, 1:40pm

This post is not new but it can give you a direction on how to perform the operation.

RabBit_BR · June 22, 2022, 1:42pm

It seems to be the same problem.

Naga_Prudhvi · June 23, 2022, 3:46am

We are looking for an API query approach.

Naga_Prudhvi · June 23, 2022, 7:07am

We are looking for an API solution and have not found one yet, so we are still searching.

Any help with an API query would be much appreciated.

Christian_Dahlqvist · June 23, 2022, 7:12am

As I pointed out in my response, I do not believe there is a query that can be used with delete by query to do this (which seems to be what you are looking for). I would therefore recommend looking at the approaches described in the blog post I linked to.

Most people frequenting the forums respond only if they have a solution, so if you do not receive any solution within a reasonable time period it is quite possible that what you are looking for is not possible.
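
Since a single delete-by-query does not appear to cover this case, one client-side alternative is to collect the IDs of the surplus copies first and then delete them explicitly through the `_bulk` API. The following is a rough sketch, not a tested production script: it assumes a search response in which duplicates are grouped by a `terms` aggregation with a `top_hits` sub-aggregation, and the aggregation names (`duplicate_groups`, `copies`), the index name `sales` and the sample response are all hypothetical.

```python
import json


def bulk_delete_body(index, search_response,
                     agg_name="duplicate_groups", hits_name="copies"):
    """Build an NDJSON body for POST /_bulk that deletes surplus copies.

    In each aggregation bucket the first hit is kept and every other hit
    becomes a bulk "delete" action.
    """
    lines = []
    for bucket in search_response["aggregations"][agg_name]["buckets"]:
        hits = bucket[hits_name]["hits"]["hits"]
        for hit in hits[1:]:  # skip the first hit: that copy stays in the index
            action = {"delete": {"_index": index, "_id": hit["_id"]}}
            lines.append(json.dumps(action))
    # The bulk API expects one JSON object per line and a trailing newline.
    return "\n".join(lines) + "\n" if lines else ""


# Hypothetical, trimmed-down aggregation response for illustration only.
response = {
    "aggregations": {
        "duplicate_groups": {
            "buckets": [
                {"key": 1, "doc_count": 3, "copies": {"hits": {"hits": [
                    {"_id": "a1"}, {"_id": "a3"}, {"_id": "a4"}]}}},
                {"key": 2, "doc_count": 2, "copies": {"hits": {"hits": [
                    {"_id": "a2"}, {"_id": "a6"}]}}},
            ]
        }
    }
}

bulk_body = bulk_delete_body("sales", response)
print(bulk_body, end="")
```

The resulting text would be sent to `POST /_bulk` with the `Content-Type: application/x-ndjson` header. Running the search again afterwards and checking that no duplicate groups remain is a sensible safeguard, since `top_hits` returns only a limited number of copies per group.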

system · July 21, 2022, 7:13am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
