Entity Matching in ES

How can we do entity matching in Elasticsearch (ES)?

E.g., a company name can have several variations:

  1. USA Tech Ltd
  2. USA Tech LLC
  3. USA Tech Asia Ltd

If the above data is present in ES in the Name field, and a 4th value,
"USA Euro Tech Ltd", comes in, then ES should identify that all four names
refer to the same entity.

How can we do that in ES?

Right now, I am running a fuzzy query against the complete data set (~100K
to 1M docs), fetching the top 20 matches, loading them into memory, and
re-scoring those 20 matches with an external Jaro-Winkler library
(Java/Lucene).
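
For reference, the two-step workflow described above could be sketched roughly
as follows. This is only an illustration under assumptions, not the actual
code in use: the function names (build_fuzzy_query, jaro_winkler, rerank) are
hypothetical, the query body assumes a `match` query with `fuzziness` set to
"AUTO" on the Name field, and a small pure-Python Jaro-Winkler stands in for
the Java/Lucene library. No Elasticsearch client is called; the candidate list
is assumed to be the Name values of the hits that came back.

```python
def build_fuzzy_query(name, size=20):
    """Request body for step 1: top `size` fuzzy matches on the Name field."""
    return {
        "size": size,
        "query": {"match": {"Name": {"query": name, "fuzziness": "AUTO"}}},
    }


def jaro(s1, s2):
    """Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters match only if they are within this distance of each other.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1, s2, prefix_scale=0.1):
    """Jaro similarity boosted for a shared prefix of up to 4 characters."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)


def rerank(name, candidates):
    """Step 2: re-score the fetched candidates in memory, best match first."""
    scored = [(c, jaro_winkler(name.lower(), c.lower())) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    print(build_fuzzy_query("USA Euro Tech Ltd"))
    hits = ["USA Tech Ltd", "USA Tech LLC", "USA Tech Asia Ltd"]
    for candidate, score in rerank("USA Euro Tech Ltd", hits):
        print(f"{score:.3f}  {candidate}")
```

The sketch lowercases both strings before scoring; whether that (or stripping
suffixes such as "Ltd" and "LLC") is appropriate depends on the data.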

Is there a way to do entity matching directly on the fields in ES?

