Using fuzzy query to find near-duplicates

mnd91 · January 13, 2016, 9:17am

Hello,

I'm new to the ELK stack. So far I have been able to configure Logstash, send data to Elasticsearch, and create visualisations in Kibana.

The data I'm sending, to an index called name for example, is a set of first-name and last-name values (those being the names of the fields), plus a third field, "cname", that is the concatenation of the two. Other fields also exist (ID, address, etc.).

I would like to know whether I can use the fuzzy query to generate a list of all near-duplicate "cname" values, meaning all cname values that might be considered duplicates of one another because they differ by only 1 or 2 letters.
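To make the goal concrete, here is a minimal Python sketch of what "near-duplicate" means here, together with the kind of request body a fuzzy query on "cname" would use. The function names and the sample names are hypothetical, made up for illustration. The distance function is plain Levenshtein, whereas the Elasticsearch fuzzy query by default also counts a swap of two adjacent letters as a single edit, so the two can disagree on transpositions. The query body also assumes "cname" is indexed as one un-analyzed term, since the fuzzy query compares individual terms.

```python
from itertools import combinations


def edit_distance(a, b):
    """Plain Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def near_duplicates(cnames, max_edits=2):
    """Return (a, b, distance) for every pair within max_edits, excluding exact matches."""
    pairs = []
    for a, b in combinations(cnames, 2):
        d = edit_distance(a, b)
        if 0 < d <= max_edits:
            pairs.append((a, b, d))
    return pairs


def fuzzy_query_body(value, field="cname", max_edits=2):
    """Build a fuzzy query body for one value (2 is the largest fuzziness allowed)."""
    return {"query": {"fuzzy": {field: {"value": value, "fuzziness": max_edits}}}}


# Hypothetical sample data, not taken from the real index.
sample = ["john smith", "jon smith", "mary jones"]
print(near_duplicates(sample))
print(fuzzy_query_body("john smith"))
```

Comparing every pair like this is only practical for a small list. Against the index, one option would be to run one fuzzy query per distinct cname and collect the hits.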

I can provide any additional information.
Thank you
