Phrase suggester: How to prefer smaller correction distances?

srd · June 27, 2017, 7:49am

I'm trying to create a phrase suggester query for a corpus that mixes German and English words (thanks, marketing guys). The people using the search will be German, so they will primarily expect German suggestions.

I'm currently stumped as to how to configure the phrase suggester to be more "natural" in its suggestions. For example, we have a bunch of documents containing the word "Stift" (a writing utensil). Using the examples from the phrase suggester documentation, a search for "Stif" (missing the t at the end) results in the suggestion "Star" (Levenshtein distance 2) instead of the more natural "Stift" (Levenshtein distance 1).
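To make the distances quoted above easy to check, here is a minimal sketch of a plain, case-sensitive Levenshtein distance in Python. The function name `levenshtein` is just an illustration; it is not how Elasticsearch scores candidates internally, only a way to verify the numbers.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    # prev[j] holds the distance between the first i-1 chars of a and the first j chars of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or match)
            ))
        prev = curr
    return prev[-1]


for candidate in ("Stift", "Star", "seit"):
    print(candidate, levenshtein("Stif", candidate))
# prints: Stift 1, Star 2, seit 3 (one pair per line)
```

Note that the comparison is case-sensitive: if both strings are lowercased first, "stif" to "seit" comes out as 2 rather than 3.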

I've tried adding combinations of the German normalization and German stemmer to the trigram analyzer, but then I get the suggestion "seit" (Levenshtein distance 3).

I've also tried changing the smoothing, gram size, and confidence parameters, but the results did not change.
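For readers unfamiliar with those knobs, the following is a rough sketch of where they sit in a phrase suggester request body, built as a Python dict. The field name `title.trigram`, the suggestion name `simple_phrase`, and all the values are assumptions modelled on the documentation example, not the configuration actually used in this post; the helper `build_phrase_suggest_body` is hypothetical.

```python
import json


def build_phrase_suggest_body(text, field="title.trigram", gram_size=3,
                              confidence=1.0, laplace_alpha=0.5):
    """Return a phrase suggester request body as a plain dict.

    The field name and default values are placeholders, not the
    original poster's settings.
    """
    return {
        "suggest": {
            "text": text,
            "simple_phrase": {  # arbitrary name for this suggestion
                "phrase": {
                    "field": field,
                    "size": 1,
                    "gram_size": gram_size,    # n-gram order of the language model
                    "confidence": confidence,  # threshold applied to the input's own score
                    "smoothing": {"laplace": {"alpha": laplace_alpha}},
                    "direct_generator": [
                        {"field": field, "suggest_mode": "always"}
                    ],
                }
            },
        }
    }


body = build_phrase_suggest_body("Stif")
print(json.dumps(body, indent=2))
```

The printed JSON could then be sent as the body of a `_search` request; varying `gram_size`, `confidence`, or the `smoothing` block corresponds to the changes described above.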

Any pointers on how to get the suggester to prefer the shorter distances?

