Hello, I ingested some data into my cluster, and it turned out that some of it was ingested twice, so some documents are duplicated.
My index has a field that should be unique. Because some documents are duplicated, I'm searching for duplicate values of that field with a terms aggregation. Here's my command:
GET <index>/_search
{
  "size": 10000,
  "aggs": {
    "duplicateNames": {
      "terms": {
        "field": "EmployeeName",
        "min_doc_count": 2
      }
    }
  }
}
With that command, I can find the duplicated values. But is there any way to delete only the "duplicate" document and keep the original one?
tl;dr: when two documents have the same value, can I delete just one of them?
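To make the goal concrete, here is a rough Python sketch of the result I'm after. It is not tied to any Elasticsearch client: the function name ids_to_delete and the sample hits are made up for illustration, and I'm assuming each hit looks like a search result with an "_id" and a "_source" containing EmployeeName. It keeps the first document seen for each name and returns the IDs of the rest.

```python
def ids_to_delete(hits, field="EmployeeName"):
    """Return the _id of every hit except the first one seen per field value."""
    seen = set()
    extra_ids = []
    for hit in hits:
        value = hit["_source"][field]
        if value in seen:
            # A document with this value was already kept; mark this one as a duplicate.
            extra_ids.append(hit["_id"])
        else:
            # First occurrence of this value: treat it as the "original" and keep it.
            seen.add(value)
    return extra_ids


# Hypothetical sample data shaped like search hits (not real output from my cluster).
sample_hits = [
    {"_id": "1", "_source": {"EmployeeName": "Alice"}},
    {"_id": "2", "_source": {"EmployeeName": "Bob"}},
    {"_id": "3", "_source": {"EmployeeName": "Alice"}},
]

print(ids_to_delete(sample_hits))
```

In this sketch "original" just means whichever document comes first in the list, so the order of the hits decides which copy survives.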
Any help is appreciated. Thanks!