Entity/Identity resolution

Yann_Barraud · April 12, 2013, 7:51am

Hi,

I've been working on search engines and data cleaning these last few days.
The challenge I'm facing right now is explaining that a search engine on its
own cannot be used for identity resolution, since a Lucene score is only
meaningful relative to other hits of the same query and is not a match
probability. The Lucene wiki pages made this easier to explain
(http://wiki.apache.org/lucene-java/ScoresAsPercentages and
http://wiki.apache.org/lucene-java/LuceneFAQ#Can_I_filter_by_score.3F).

I've also been playing with the Duke project for batch data deduplication.
It has proved very powerful and covers my batch requirements.

Now I'm wondering whether there is an opportunity to merge the two projects
at some point to get a fast, live identity resolution service.

I'd say:

- Duke delegates data analysis and indexing to ES (Elasticsearch), as they
  both rely on Lucene indexes.
- Duke turns into an ES plugin that returns the records matching a query,
  with a Bayesian probability as output.
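
To illustrate what "Bayesian probability as an output" could mean, here is a minimal sketch of combining per-field match probabilities with Bayes' rule. As far as I understand, this is close in spirit to what Duke does, but the class name, method names and probability values below are made up for illustration and are not Duke's or ES's API. Inputs are assumed to lie strictly between 0 and 1.

```java
// Hypothetical sketch: not Duke's or Elasticsearch's actual API.
final class MatchProbability {

    // Combine two independent pieces of evidence with Bayes' rule.
    // 0.5 means "no evidence either way"; values are assumed to be in (0, 1).
    static double combine(double p1, double p2) {
        double both = p1 * p2;
        return both / (both + (1.0 - p1) * (1.0 - p2));
    }

    // Fold any number of per-field probabilities into one overall probability,
    // starting from a neutral prior of 0.5.
    static double combineAll(double... fieldProbabilities) {
        double p = 0.5;
        for (double fp : fieldProbabilities) {
            p = combine(p, fp);
        }
        return p;
    }

    public static void main(String[] args) {
        // Assumed values: name agrees strongly, email agrees, zip code disagrees.
        double p = combineAll(0.9, 0.8, 0.3);
        System.out.println("match probability = " + p);
    }
}
```

Unlike a Lucene score, the result always stays between 0 and 1 and can be compared against a fixed threshold, which is what a live identity resolution service would need.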

Regards,
Yann Barraud

