I have an ELK setup. One index holds around 70 lakh (7 million) records, some of which are duplicates. I need to create a filter in Kibana so that only unique records are returned. Uniqueness has to be checked on two fields. For example, if the phone number and city fields of Document 1 and Document 2 are the same, we consider those documents to be the same, and the query should return only one of them. The ID is auto-generated, so we can't use it for uniqueness. I need help with this.
Why does your data have duplicates? Are the duplicates needed for other use cases? How is your data getting ingested? I would recommend removing duplicates during ingest. You can specify your own _id field. Maybe it makes sense to make this field the concatenation of phone number and city.
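As a rough illustration of that idea, here is a minimal Python sketch that derives a deterministic _id from the two fields. The field names phone_number and city, the index name contacts, and the normalisation step are assumptions for illustration, not details from your setup. It also hashes the concatenated key rather than using the raw concatenation, purely to keep the id short. The action dicts follow the shape accepted by elasticsearch.helpers.bulk.

```python
import hashlib


def make_doc_id(phone_number: str, city: str) -> str:
    """Build a deterministic _id from the two fields that define uniqueness."""
    # Assumption: case and surrounding whitespace should not make records distinct.
    key = f"{phone_number.strip()}|{city.strip().lower()}"
    # Hash the concatenated key so the id is short and has no awkward characters.
    return hashlib.sha1(key.encode("utf-8")).hexdigest()


def to_bulk_actions(docs, index_name="contacts"):
    """Yield one bulk action per document, with _id set from its fields."""
    for doc in docs:
        yield {
            "_index": index_name,  # hypothetical index name
            "_id": make_doc_id(doc["phone_number"], doc["city"]),
            "_source": doc,
        }


# Hypothetical sample records: the first two differ only in case and spacing.
docs = [
    {"phone_number": "9876543210", "city": "Pune"},
    {"phone_number": "9876543210", "city": "pune "},
    {"phone_number": "9876543210", "city": "Mumbai"},
]

actions = list(to_bulk_actions(docs))
unique_ids = {action["_id"] for action in actions}
```

Because indexing a document under an _id that already exists overwrites the earlier one, records sharing the same phone number and city would end up as a single document.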
The data is already ingested. Now I need to read it based on the above condition. I need to operate on the data that already exists.