Is it possible to detect duplicated entries for a doc?


(Aid Hamza) #1

Hey guys,
I'm trying to build a microservice for our website to prevent users from submitting duplicate entries: when a user adds a property, the moderator can check whether it matches a duplicate or very similar one already indexed in Elasticsearch.

The query will contain the "title" and "body" to check whether any existing property has similar fields (more than 70% similarity). I also pass the user_id and phone number so that only the user's previous properties are searched.

Given my limited knowledge of full-text search engines, this is the query I wrote after digging around the web. Is it suitable for my needs?

{
  "query": {
    "filtered": {
      "query": {
        "bool": {
          "must": [
            {
              "dis_max": {
                "queries": [
                  {
                    "more_like_this": {
                      "fields": ["subject"],
                      "like_text": "Apartment located in Casablanca, 59m",
                      "min_term_freq": 1,
                      "max_query_terms": 1,
                      "boost": 20
                    }
                  },
                  {
                    "more_like_this": {
                      "fields": ["body"],
                      "like_text": "I want to sell an apartment etc ..... ",
                      "min_term_freq": 1,
                      "max_query_terms": 1,
                      "boost": 20
                    }
                  }
                ]
              }
            },
            {
              "match": {
                "phone": "0652808029"
              }
            }
          ]
        }
      }
    }
  }
}
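
For illustration, here is a rough Python sketch of how the microservice might assemble a variant of this request. It rests on several assumptions that are not in the query above: the helper name build_duplicate_query is hypothetical, "user_id" is an assumed field name, the phone and user restrictions are moved into the "filter" part of the filtered query so they do not affect scoring, and "minimum_should_match" (accepted by more_like_this from Elasticsearch 1.5 on; older 1.x releases used "percent_terms_to_match") stands in for the 70% threshold. Requiring 70% of the selected terms to match is only an approximation of "70% similar", not an exact similarity measure.

```python
import json


def build_duplicate_query(subject, body, user_id, phone, min_match="70%"):
    """Build a 1.x-style "filtered" request body (hypothetical helper)."""

    def similar(field, text):
        # One more_like_this clause per field. "minimum_should_match" is an
        # assumption: it needs Elasticsearch 1.5+; older 1.x releases used
        # "percent_terms_to_match" (e.g. 0.7) instead.
        return {
            "more_like_this": {
                "fields": [field],
                "like_text": text,
                "min_term_freq": 1,
                "max_query_terms": 25,  # the documented default, instead of 1
                "minimum_should_match": min_match,
            }
        }

    return {
        "query": {
            "filtered": {
                # Scoring part: dis_max keeps the best-matching field's score.
                "query": {
                    "dis_max": {
                        "queries": [
                            similar("subject", subject),
                            similar("body", body),
                        ]
                    }
                },
                # Non-scoring part: restrict to this user's earlier listings.
                # "user_id" is an assumed field name; only "phone" is in
                # the original query.
                "filter": {
                    "bool": {
                        "must": [
                            {"term": {"user_id": user_id}},
                            {"term": {"phone": phone}},
                        ]
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    query = build_duplicate_query(
        "Apartment located in Casablanca, 59m",
        "I want to sell an apartment",
        user_id=42,  # made-up value for the example
        phone="0652808029",
    )
    print(json.dumps(query, indent=2))
```

The term filters assume that "phone" and "user_id" are indexed as single exact values; if they are analyzed differently in the real mapping, a match query would be the safer choice.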

Thanks

