Using ES as a dictionary server - need advice

ElasticSearch_Users_ · January 21, 2015, 10:02pm

I'm working on a solution that will act as a dictionary validator by
performing the following:

input: phrase
processing: shingles phrase match with fuzziness
output: rewritten phrase
data: dictionary-like, with entries that are short phrases of up to 5 words
(e.g. "know it all", "merry go round")
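
To make the "shingles" step concrete, here is a minimal sketch in plain Python (not Elasticsearch code) of how an input phrase breaks down into word n-grams that can each be looked up against the dictionary. The function name `shingles` and its parameters are my own assumptions for illustration; in ES itself a shingle token filter would normally do this at analysis time.

```python
def shingles(phrase, min_size=1, max_size=5):
    """Return all contiguous word n-grams (shingles) of a phrase.

    max_size defaults to 5 because dictionary entries are at most 5 words.
    """
    words = phrase.lower().split()
    result = []
    # Shorter shingles first, then longer ones, each scanned left to right.
    for size in range(min_size, min(max_size, len(words)) + 1):
        for start in range(len(words) - size + 1):
            result.append(" ".join(words[start:start + size]))
    return result


if __name__ == "__main__":
    for shingle in shingles("a merry go round", min_size=2):
        print(shingle)
```

Each shingle is then a candidate to match, exactly or fuzzily, against a dictionary entry.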

What's particular about this use case is that we don't care about TF / IDF
and have another mechanism in mind to select an entry (but that's not the
issue).

Things started well, with queries involving a phrase suggester, direct
generator and collation, but then we hit a snag: fuzzy matches (edit
distance > 0) rank higher than exact matches.

I've been discussing this in another thread
(https://groups.google.com/forum/#!searchin/elasticsearch/bose/elasticsearch/dLdT90j1x74/zqJQiSlgHv8J)
but I wanted to present my use case a bit more clearly and see if anyone has
advice on how to achieve this.

I tried to use FLT (the Fuzzy Like This query), as kindly recommended by Mark
Harwood, but couldn't figure out how to use it as a phrase suggester.

The key here, I think, is to control the scoring of the suggester: ignore
TF / IDF and instead rank candidates by an n-gram formula involving edit
distance, leaving further custom processing to select the right suggester
entry. I looked at the smoothing models, but they all seem to be based, to a
greater or lesser extent, on TF / IDF.
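
For illustration, the ranking I'm after looks roughly like the sketch below: candidates are ordered purely by edit distance, so an exact match (distance 0) always comes first and term frequencies play no part. This is plain Python, not anything the suggester exposes; `edit_distance`, `rank_candidates` and `max_edits` are hypothetical names, and plain Levenshtein distance is an assumption (the direct generator may use a different string distance).

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum inserts, deletes and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def rank_candidates(shingle, dictionary, max_edits=2):
    """Return (entry, distance) pairs within max_edits, closest first.

    No TF / IDF is involved: distance is the only ranking signal, with
    alphabetical order as a deterministic tie-breaker.
    """
    scored = [(entry, edit_distance(shingle, entry)) for entry in dictionary]
    kept = [pair for pair in scored if pair[1] <= max_edits]
    return sorted(kept, key=lambda pair: (pair[1], pair[0]))


if __name__ == "__main__":
    entries = ["know it all", "merry go round", "merry go rounds"]
    for entry, distance in rank_candidates("merry go round", entries):
        print(distance, entry)
```

The list of (entry, distance) pairs is what I would then hand to our own selection mechanism.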

Any advice would be appreciated!
