How to find Similar documents

basharz · March 27, 2016, 12:23pm

Guys,
Let us say I have an Elasticsearch index that contains about 10,000 documents. I did a random manual check on the field "building name" for a few documents and found that a lot of documents match each other on this field. Is there any way I can find all documents that are similar to each other using Elasticsearch capabilities?

Glen_Smith · March 27, 2016, 3:06pm

https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-mlt-query.html

nik9000 · March 27, 2016, 5:02pm

Look at "more like this".

basharz · March 28, 2016, 2:13pm

Thanks guys for your feedback. Actually, our challenge is that we do not have a specific text that we want to compare with all documents; each set of documents could match each other on a different text. If we use the sample queries below, we need to supply the search text up front, which we do not have.
If there is no solution, we will do some development work against the API to go through all documents, apply the queries below to each one, and then group the documents according to the scoring results.
This is the sample MLT query:
{
  "query": {
    "more_like_this": {
      "fields": [
        "building"
      ],
      "like_text": "my text will be here",
      "min_term_freq": 1,
      "max_query_terms": 25
    }
  }
}

This is the sample phonetic-matching query:

{
  "query": {
    "match": {
      "building.phonetic": {
        "query": "my text will be here",
        "operator": "or"
      }
    }
  }
}
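
A rough sketch of the "go through all documents and group them" idea described above, not a tested solution. It assumes the more_like_this query is given an indexed document in its "like" parameter instead of free text, so no search text has to be supplied up front (older releases may also need "_type" in that reference). The function names, the index name "buildings" and the find_similar callback are hypothetical; find_similar stands in for whatever client call runs the query and returns the ids of the hits.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List


def build_mlt_query(index: str, doc_id: str, field: str = "building") -> dict:
    """Build a more_like_this body that uses an indexed document as the 'like' input."""
    return {
        "query": {
            "more_like_this": {
                "fields": [field],
                # Reference an existing document rather than passing text.
                "like": [{"_index": index, "_id": doc_id}],
                "min_term_freq": 1,
                "max_query_terms": 25,
            }
        }
    }


def group_similar(
    doc_ids: Iterable[str],
    find_similar: Callable[[str], Iterable[str]],
) -> List[List[str]]:
    """Merge documents into groups whenever one is returned as similar to another.

    find_similar(doc_id) is a hypothetical callback: it should run the query
    for doc_id and return the ids of the documents that scored high enough.
    """
    ids = list(doc_ids)
    parent: Dict[str, str] = {d: d for d in ids}

    def find(x: str) -> str:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for d in ids:
        for other in find_similar(d):
            if other in parent:
                parent[find(other)] = find(d)

    groups: Dict[str, List[str]] = defaultdict(list)
    for d in ids:
        groups[find(d)].append(d)
    # Keep only groups that contain at least two documents.
    return [sorted(g) for g in groups.values() if len(g) > 1]
```

In practice find_similar would send build_mlt_query("buildings", doc_id) to the search endpoint and keep only hits above a score threshold chosen by inspection. Note that this issues one query per document, so 10,000 documents means 10,000 searches.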
