Remove duplicate / multiple data records in Elastic

Hi, due to a bug in the program that writes records to Elasticsearch, our index now contains many entries with exactly the same value in the "time" attribute. For example, there are 50 records whose "time" field holds the value "Aug 5, 2021 @ 10:39:37.720" (the "time" field is filled by one of our own programs, not by Elasticsearch). There are many more groups like this, where the time is identical down to the millisecond. I would like to remove the extra entries so that only one entry remains per time value. The theoretical case of two genuinely different entries sharing the same timestamp can be ignored. At the moment there are around 630,000 records in the index, of which around 300,000 are probably duplicates.

Unfortunately, I am not very familiar with Elasticsearch queries. How can all duplicate entries be identified and removed?

Is there also a way to hide the duplicate entries, e.g. using a filter in Kibana?

The Elasticsearch version is 7.11.2.

See the Delete by query API page in the Elasticsearch Guide [7.14].
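Delete by query removes whatever a query matches, so the duplicates still have to be picked out first. Below is a minimal Python sketch of one way that could be done on the client side, not a definitive solution. It assumes the hits have already been fetched (for example with a scroll search) and have the usual shape of search hits, with `_id` and `_source`. The index name `my-index` and the helper names are hypothetical. It keeps the first document seen for each "time" value and produces a Bulk API request body that deletes the rest.

```python
import json
from typing import Dict, Iterable, List, Set


def ids_to_delete(hits: Iterable[Dict], field: str = "time") -> List[str]:
    """Return the _id of every hit whose `field` value was already seen.

    The first hit for each value is kept; which one comes first depends
    on the order in which the hits were fetched.
    """
    seen: Set[str] = set()
    duplicates: List[str] = []
    for hit in hits:
        value = hit["_source"][field]
        if value in seen:
            duplicates.append(hit["_id"])
        else:
            seen.add(value)
    return duplicates


def bulk_delete_body(index: str, ids: Iterable[str]) -> str:
    """Build a newline-delimited JSON body of delete actions for the Bulk API."""
    lines = [
        json.dumps({"delete": {"_index": index, "_id": doc_id}})
        for doc_id in ids
    ]
    # The Bulk API expects the body to end with a newline.
    return ("\n".join(lines) + "\n") if lines else ""


# Made-up sample data in the shape of search hits (assumption, not real records).
sample_hits = [
    {"_id": "a", "_source": {"time": "2021-08-05T10:39:37.720Z"}},
    {"_id": "b", "_source": {"time": "2021-08-05T10:39:37.720Z"}},
    {"_id": "c", "_source": {"time": "2021-08-05T10:39:38.001Z"}},
    {"_id": "d", "_source": {"time": "2021-08-05T10:39:37.720Z"}},
]

duplicate_ids = ids_to_delete(sample_hits)
body = bulk_delete_body("my-index", duplicate_ids)
print(duplicate_ids)
print(body, end="")
```

With roughly 630,000 records the set of seen time values fits comfortably in memory. It would be sensible to try this against a copy of the index first and to send the delete actions in batches rather than in one request.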

