Using fuzzy query to find near-duplicates


#1

Hello,

I'm new to the ELK stack. So far I have been able to configure Logstash, send data to Elasticsearch, and create visualisations in Kibana.

The data I'm sending goes into an index called, for example, name. Each document has a first-name field and a last-name field, plus a third field, "cname", that is the concatenation of the two. Other fields also exist (ID, address, etc.).

I would like to know whether I can use the fuzzy query to generate a list of all "cname" values that are near-duplicates, that is, values that could be considered duplicates of one another and differ by only 1 or 2 letters.
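
To make the question concrete, here is a rough Python sketch of what I have in mind; it is not a working setup. The first function only builds the request body for a fuzzy query on "cname" (the field name comes from my data; I am assuming the field is indexed so that the whole value is a single term). The other two are hypothetical helpers of my own that check locally which names are within 1 or 2 edits of each other, using plain Levenshtein distance. As far as I understand, Elasticsearch by default also counts a swap of two adjacent letters as one edit, so the results may differ slightly.

```python
import json


def fuzzy_query(cname, max_edits=2):
    """Build a search body that matches cname values within max_edits edits."""
    return {
        "query": {
            "fuzzy": {
                "cname": {
                    "value": cname,
                    "fuzziness": max_edits,  # 2 is the largest value I would expect to need
                }
            }
        }
    }


def edit_distance(a, b):
    """Plain Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete from a
                           cur[j - 1] + 1,              # insert into a
                           prev[j - 1] + (ca != cb)))   # substitute or match
        prev = cur
    return prev[-1]


def near_duplicates(cnames, max_edits=2):
    """Return pairs of distinct names that differ by at most max_edits edits."""
    names = sorted(set(cnames))
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if edit_distance(a, b) <= max_edits:
                pairs.append((a, b))
    return pairs


if __name__ == "__main__":
    # Sample names are made up for illustration only.
    print(json.dumps(fuzzy_query("johnsmith"), indent=2))
    print(near_duplicates(["johnsmith", "jonsmith", "maryjones"]))
```

The local comparison checks every pair, so it would only be practical for a small sample; what I am really asking is whether Elasticsearch can produce the same kind of list across the whole index.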

I can provide any additional information.
Thank you

