On Mon, Sep 16, 2013 at 4:08 PM, Ark firstname.lastname@example.org wrote:
How can I model the query and/or mapping so that a partial match of a
sub-phrase has a higher score than what an edgengram would return?
For example, if I have four documents:
- foo bar blah
- foo blah bar
- bar foo blah
- bar blah foo
If the search string is "bar bl", I would like documents 1 and 4 to be
scored higher than documents 2 and 3.
If the field is indexed using edgengram, all 4 documents would match
(which is fine for my use-case) but I think the scoring cannot yield the
result I am looking for.
There is also a "match_phrase_prefix" but that would match only #4.
You could use the edgeNGram filter on top of the shingle filter (with
output_unigrams=false). This would allow you to boost on prefixes and
positions at the same time.
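
To illustrate, here is a rough Python sketch of what such an analysis chain would produce. It is a simulation, not Elasticsearch's actual implementation: the helper functions, the filter and analyzer names (my_shingles, my_edge_ngrams, shingle_prefix) and the gram sizes are all made up for this example, and it assumes the query "bar bl" is looked up as a single term (for instance by using a search analyzer that builds shingles but no edge n-grams).

```python
# Hypothetical index settings: the filter/analyzer names and gram sizes are
# assumptions for illustration; only the filter types and parameters
# (shingle, edgeNGram, output_unigrams, min_gram, max_gram) are standard.
settings = {
    "analysis": {
        "filter": {
            "my_shingles": {
                "type": "shingle",
                "min_shingle_size": 2,
                "max_shingle_size": 2,
                "output_unigrams": False,
            },
            "my_edge_ngrams": {"type": "edgeNGram", "min_gram": 1, "max_gram": 20},
        },
        "analyzer": {
            "shingle_prefix": {
                "type": "custom",
                "tokenizer": "standard",
                "filter": ["lowercase", "my_shingles", "my_edge_ngrams"],
            }
        },
    }
}


def shingles(tokens, size=2):
    """Join every run of `size` adjacent tokens (no unigrams are emitted)."""
    return [" ".join(tokens[i:i + size]) for i in range(len(tokens) - size + 1)]


def edge_ngrams(token, min_gram=1, max_gram=20):
    """Return the prefixes of `token` between min_gram and max_gram characters."""
    return [token[:n] for n in range(min_gram, min(len(token), max_gram) + 1)]


def index_terms(text):
    """Simulate lowercase -> whitespace split -> shingles -> edge n-grams."""
    terms = set()
    for shingle in shingles(text.lower().split()):
        terms.update(edge_ngrams(shingle))
    return terms


docs = ["foo bar blah", "foo blah bar", "bar foo blah", "bar blah foo"]
query = "bar bl"

# 1-based numbers of the documents whose shingle prefixes contain the query.
matches = [n for n, doc in enumerate(docs, start=1) if query in index_terms(doc)]
print(matches)
```

Only the documents in which "bar" is immediately followed by a word starting with "bl" contain the term "bar bl". Querying a field analyzed this way together with the plain edgengram field (which still matches all four documents) is one way to give those documents the extra score.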
The fact that you are interested in prefix matches makes me wonder whether
you are trying to implement auto-completion: if this is the case, a better
option could be to use the completion suggester (which is way faster than
any index-based solution) and use all suffixes of your text as inputs. For
example, the "foo bar blah" suggestion could be indexed with
"input": ["foo bar blah", "bar blah", "blah"]. If you are not trying to
implement auto-completion, you can safely ignore this comment.
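
A minimal sketch of how those inputs could be generated, assuming whitespace-separated words. The suffix_inputs helper and the "suggest" field name are hypothetical and only serve the example; the field would need to be mapped with the completion type.

```python
def suffix_inputs(text):
    """Return every word-level suffix of `text`, longest first."""
    words = text.split()
    return [" ".join(words[i:]) for i in range(len(words))]


# Hypothetical document body: "suggest" is an assumed field name.
doc = {"suggest": {"input": suffix_inputs("foo bar blah")}}
print(doc)
```

With these inputs, typing "bar bl" would complete to the "foo bar blah" suggestion through its "bar blah" suffix.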