Identifying duplicates and deleting the old entries for a unique key such as Task ID

Hello, we recently moved all our tasks from a Java application to the ELK Stack, but we have found that duplicates were created in Elasticsearch, and our reports are incorrect because of them. We have a volume of about 40 million records. We want to create a job, in Painless or some other way, to identify all the duplicates and delete the old entries. My idea is to pick up records in batches and then, for each task ID, fetch all documents with that task ID and delete the older entries. That should solve the problem.
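To make the idea concrete, here is a minimal sketch of the "keep the newest document per task ID" step in plain Python. It is only an illustration under assumptions: the record shape `(doc_id, task_id, updated_at)`, the field name `updated_at`, and the index name `tasks` are hypothetical, and the timestamps are assumed to be ISO 8601 strings in a single timezone so that they sort correctly as text. The sketch does not talk to the cluster; in practice the records would come from a scroll or a composite aggregation on the task ID field, and the resulting lines would be sent to the `_bulk` API.

```python
import json
from typing import Dict, Iterable, List, Tuple

# Hypothetical record shape: (doc_id, task_id, updated_at).
Record = Tuple[str, str, str]


def find_stale_doc_ids(records: Iterable[Record]) -> List[str]:
    """Return the _id of every document that is not the newest for its task_id."""
    newest: Dict[str, Tuple[str, str]] = {}  # task_id -> (updated_at, doc_id)
    stale: List[str] = []
    for doc_id, task_id, updated_at in records:
        current = newest.get(task_id)
        if current is None:
            # First document seen for this task ID.
            newest[task_id] = (updated_at, doc_id)
        elif updated_at > current[0]:
            # This document is newer, so the previous "newest" becomes stale.
            stale.append(current[1])
            newest[task_id] = (updated_at, doc_id)
        else:
            # This document is older than (or tied with) the one already kept.
            stale.append(doc_id)
    return stale


def bulk_delete_lines(index: str, doc_ids: Iterable[str]) -> List[str]:
    """Build the NDJSON action lines for delete operations in a _bulk request."""
    return [json.dumps({"delete": {"_index": index, "_id": d}}) for d in doc_ids]


if __name__ == "__main__":
    sample = [
        ("a", "T1", "2023-01-01T00:00:00Z"),
        ("b", "T1", "2023-02-01T00:00:00Z"),
        ("c", "T2", "2023-01-15T00:00:00Z"),
    ]
    for line in bulk_delete_lines("tasks", find_stale_doc_ids(sample)):
        print(line)
```

One caveat with the batch approach: every document for a given task ID has to land in the same batch, otherwise the comparison misses duplicates that fall in different batches. Batching by task ID (rather than by arbitrary pages of documents) avoids that.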

