Is it possible to query ordered string in Elasticsearch?


(少爷允之) #1

I'm wondering whether it is possible to query for an ordered string in
Elasticsearch. For example, the query could be as follows:

{
  "explain": true,
  "query": {
    "fuzzy_like_this": {
      "fields": [
        "domainElemString"
      ],
      "like_text": "engagement definition research resolution"
    }
  }
}

There are two docs with a value in the field "domainElemString". One value
is identical to the like_text, and the other contains the same words in a
different order, for instance "definition engagement research resolution".
I used the FLT query because it is based on edit distance, so I thought the
doc with the identical field value should have a higher score than the
other one. But the explanation in the hits shows otherwise: the scores of
these two docs are almost the same, differing only in idf.

Am I missing something about the FLT query? Is there any way to do this?
Thanks!

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Clinton Gormley) #2

Hiya

The FLT query doesn't take word position into account. The edit distance
refers to character changes within a single word, e.g. "resolutionS" ->
"resolution" has an edit distance of 1. This is the basis for "fuzzy"
matching in Lucene and Elasticsearch.
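
To make the per-word point concrete, here is a minimal sketch in Python. It uses a plain Levenshtein distance purely as an illustration; it is not Lucene's implementation, and the helper name `levenshtein` is hypothetical. It compares each query term against the terms of the reordered document and ignores where in the string each term sits.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete a character from a
                curr[j - 1] + 1,     # insert a character from b
                prev[j - 1] + cost,  # substitute (or keep) a character
            ))
        prev = curr
    return prev[-1]


query_terms = "engagement definition research resolution".split()
doc_terms = "definition engagement research resolution".split()

# Character-level distance within one word, as in the example above.
print(levenshtein("resolutions", "resolution"))

# Term-by-term comparison that ignores position: for each query term,
# take the closest term anywhere in the document.
best = {q: min(levenshtein(q, d) for d in doc_terms) for q in query_terms}
print(best)
```

Every query term finds an identical term in the reordered document, which is consistent with the two docs getting nearly the same score.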

If you want to take position into account, then you should use the
"match_phrase" query with a "slop" parameter (where slop is a bit like edit
distance, but for word positions). Note: in a match_phrase query, all
words must be present, so you probably want to add this as a "should"
clause inside a "bool" query (i.e. if the match_phrase clause matches, it
increases the score for that document, but it is not required to match).
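
As a rough sketch of that suggestion, the Python snippet below builds such a request body and prints it as JSON. Several things here are assumptions rather than anything stated in the thread: the helper name `build_query` is hypothetical, a plain "match" query stands in as the required clause, and the slop value of 2 is only illustrative and would need tuning against real data.

```python
import json


def build_query(field: str, text: str, slop: int = 2) -> dict:
    """Build a bool query: required term match plus an optional phrase boost."""
    return {
        "explain": True,
        "query": {
            "bool": {
                # Assumption: a plain match query selects the candidate docs.
                "must": [
                    {"match": {field: text}}
                ],
                # Optional clause: docs whose words appear in (nearly) the
                # same order as the query text get a higher score.
                "should": [
                    {"match_phrase": {field: {"query": text, "slop": slop}}}
                ],
            }
        },
    }


body = build_query(
    "domainElemString",
    "engagement definition research resolution",
)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a search request. Because the phrase clause sits under "should", docs that fail it are still returned; they just miss the extra score.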

clint

(Replying to 少爷允之's message of 12 September 2013, 22:43.)


(少爷允之) #3

Thanks a lot, it works now.

(Replying to Clinton Gormley's message of Friday, 13 September 2013, 11:33:18 UTC+2.)

