Is there a kind of query/rescore/similarity magic that lets me know if all the terms in a field are matched?


(Nik Everett) #1

I'm looking to boost matches where all the terms in the field match,
beyond what I'm getting out of the default similarity. Is there some way to
ask Elasticsearch to do that? I'm OK with only checking some small
window of top documents, or really anything other than a large performance
hit. To be honest, I haven't played much with similarities, so maybe
what I want is already there.

Thanks!

Nik

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAPmjWd3y1OME0C31M69Ugs71T%2BnU2b%2Bpyq45Wga71vOv1GTdTQ%40mail.gmail.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Brian Yoder) #2

Nik,

No, there is not.

There's a workaround: at index time, store the number of terms in the field
in a second field. Then analyze your query string to count its terms, and
use that count to match only documents whose stored count is the same.
But consider the following field:

"text" : "Very Big Dog"

Three terms in the field's value, right?

And consider the query:

"+text:very +text:very +text:very"

As in:

{
  "bool" : {
    "must" : [ {
      "match" : {
        "text" : {
          "query" : "very",
          "type" : "boolean"
        }
      }
    }, {
      "match" : {
        "text" : {
          "query" : "very",
          "type" : "boolean"
        }
      }
    }, {
      "match" : {
        "text" : {
          "query" : "very",
          "type" : "boolean"
        }
      }
    } ]
  }
}

Three query terms, right?

But it will match the field, and the term counts will match, and therefore
you will then be told that "Very Very Very" is a perfect match for
"Very Big Dog".

Oops!
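
The pitfall can be reproduced outside Elasticsearch with a toy model. This is only a sketch: `analyze` is a hypothetical stand-in for a real analyzer (it just lowercases and splits on whitespace), and `naive_all_terms_match` is an illustrative name, not an Elasticsearch or Lucene API. It combines "every query term occurs in the field" with "the term counts are equal", which is the workaround described above.

```python
def analyze(text):
    """Toy analyzer (assumption): lowercase and split on whitespace."""
    return text.lower().split()


def naive_all_terms_match(field_value, query):
    """Term-count workaround: all query terms present AND counts equal."""
    field_terms = analyze(field_value)
    query_terms = analyze(query)
    # Equivalent of the bool/must clauses: each query term occurs somewhere.
    all_present = all(term in field_terms for term in query_terms)
    # Equivalent of comparing against the term count stored at index time.
    same_count = len(field_terms) == len(query_terms)
    return all_present and same_count


print(naive_all_terms_match("Very Big Dog", "very big dog"))    # True
print(naive_all_terms_match("Very Big Dog", "very big"))        # False
print(naive_all_terms_match("Very Big Dog", "very very very"))  # True (false positive)
```

The last call is the duplicate-term case: the check passes even though "big" and "dog" were never matched.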

This is a Lucene limitation. Probably not a really big deal; I only know of
two search engines that can properly handle duplicate terms: Google's, and
the one I wrote in my previous life. But it is something that would be a
very nice and useful feature for Lucene. Since Lucene already knows the
word positions, it can verify that each term matches a unique word position
(which is what I did in mine).
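
The unique-position idea can be sketched as follows. This is not Lucene code and not a reconstruction of the engine mentioned above; it is a minimal illustration under stated assumptions: a toy whitespace analyzer, exact term equality (no synonyms or stemming), and hypothetical function names. Each query term must claim a field position that no earlier query term has claimed, and a full-field match requires every position to be claimed.

```python
def analyze(text):
    """Toy analyzer (assumption): lowercase and split on whitespace."""
    return text.lower().split()


def claim_positions(field_value, query):
    """Return the set of field positions claimed by the query terms,
    or None if some query term has no unclaimed position left."""
    positions = {}
    for pos, term in enumerate(analyze(field_value)):
        positions.setdefault(term, []).append(pos)

    claimed = set()
    for term in analyze(query):
        free = [p for p in positions.get(term, []) if p not in claimed]
        if not free:
            return None  # duplicate query term with nothing left to match
        claimed.add(free[0])
    return claimed


def is_full_field_match(field_value, query):
    """True only if the query terms cover every position in the field."""
    claimed = claim_positions(field_value, query)
    return claimed is not None and len(claimed) == len(analyze(field_value))


print(is_full_field_match("Very Big Dog", "very very very"))  # False
print(is_full_field_match("Very Big Dog", "dog big very"))    # True
print(is_full_field_match("Very Very Dog", "very dog very"))  # True
```

Because each position holds exactly one term in this toy model, claiming the first free position is enough; with synonyms or multiple tokens per position, a proper matching step would be needed instead.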

Brian

(In reply to Nikolas Everett's message of Thursday, January 9, 2014, 11:18:50 AM UTC-5.)


