Identifying duplicates and deleting the old entries for a unique key like Task Id

Aditya_Macherla · August 30, 2024, 7:45pm

Hello, we recently moved all of our tasks from a Java application to the ELK Stack, but we found that duplicates were created in Elasticsearch, and our reports are incorrect because of them. We have a volume of about 40 million records. We want to create a job, in Painless or some other way, that identifies all the duplicates and deletes the old entries. My idea is to read the records in batches and, for each Task Id, fetch all documents with that Task Id and delete the old entries. That should solve the problem.
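The batch idea above can be sketched on the client side. This is only a minimal sketch under stated assumptions, not a tested migration job: the field names `task_id` and `updated_at`, the index name `tasks`, and the tuple shape of each record are hypothetical, since the post does not name them. It also assumes the timestamps are ISO 8601 strings in a single timezone, so that comparing them as strings matches chronological order. The sketch keeps the newest document per Task Id, collects the `_id` of every older copy, and formats delete actions as NDJSON lines for the Elasticsearch `_bulk` API.

```python
import json
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical record shape: (_id, task_id, updated_at).
# These field names are assumptions; the original post does not name them.
Doc = Tuple[str, str, str]


def find_stale_ids(docs: Iterable[Doc]) -> List[str]:
    """Return the _id of every document that is not the newest for its task_id."""
    newest: Dict[str, Tuple[str, str]] = {}  # task_id -> (updated_at, _id)
    stale: List[str] = []
    for doc_id, task_id, updated_at in docs:
        current = newest.get(task_id)
        if current is None:
            # First copy seen for this task: keep it for now.
            newest[task_id] = (updated_at, doc_id)
        elif (updated_at, doc_id) > current:
            # This copy is newer: the previously kept one becomes stale.
            stale.append(current[1])
            newest[task_id] = (updated_at, doc_id)
        else:
            # This copy is older than (or tied below) the kept one.
            stale.append(doc_id)
    return stale


def bulk_delete_lines(index: str, ids: Iterable[str]) -> Iterator[str]:
    """Yield one NDJSON delete action per document id for the _bulk API."""
    for doc_id in ids:
        yield json.dumps({"delete": {"_index": index, "_id": doc_id}})


# Made-up sample records for illustration only.
docs = [
    ("a1", "T-1", "2024-08-01T10:00:00Z"),
    ("a2", "T-1", "2024-08-15T09:30:00Z"),
    ("b1", "T-2", "2024-08-02T12:00:00Z"),
    ("a3", "T-1", "2024-07-20T08:00:00Z"),
]
stale_ids = find_stale_ids(docs)

# The _bulk request body must end with a newline.
payload = "\n".join(bulk_delete_lines("tasks", stale_ids)) + "\n"
print(payload)
```

In a real run, the records would presumably be streamed from the index in pages (for example with `search_after` or a scroll) rather than held in a list, and the payload would be sent to `_bulk` in chunks. The lookup table holds one entry per distinct Task Id, so its memory use grows with the number of unique tasks, not with the number of duplicates.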
