Query_string search with wildcard uses ConstantScoreQuery score


(mbro) #1

We're using a queryString search and have noticed that a
ConstantScoreQuery is used when a wildcard '*' is appended to a term.
Because a ConstantScoreQuery only takes the field boost into account,
results matching on a given field aren't ranked within that field
(every result matching on the same field gets the same score).

For example, if we search for 'cyclist*', the explain output shows that
a ConstantScoreQuery is used:

Eric Biggs: 0.25 = max of:
  0.25 = ConstantScoreQuery(tagSet:cyclist*^5.0), product of:
    5.0 = boost
    0.05 = queryNorm
Lorri Benyaker: 0.25 = max of:
  0.25 = ConstantScoreQuery(tagSet:cyclist*^5.0), product of:
    5.0 = boost
    0.05 = queryNorm

But if we search for 'cyclist' (no wildcard), a weighted query is used:

Lorri Benyaker: 0.29494813 = max of:
  0.29494813 = weight(tagSet:cyclist^5.0 in 179), product of:
    0.20186047 = queryWeight(tagSet:cyclist^5.0), product of:
      5.0 = boost
      6.748756 = idf(docFreq=4, maxDocs=1569)
      0.005982154 = queryNorm
    1.4611485 = fieldWeight(tagSet:cyclist in 179), product of:
      3.4641016 = tf(termFreq(tagSet:cyclist)=12)
      6.748756 = idf(docFreq=4, maxDocs=1569)
      0.0625 = fieldNorm(field=tagSet, doc=179)
Eric Biggs: 0.26329386 = max of:
  0.26329386 = weight(tagSet:cyclist^5.0 in 177), product of:
    0.20186047 = queryWeight(tagSet:cyclist^5.0), product of:
      5.0 = boost
      6.748756 = idf(docFreq=4, maxDocs=1569)
      0.005982154 = queryNorm
    1.3043358 = fieldWeight(tagSet:cyclist in 177), product of:
      4.1231055 = tf(termFreq(tagSet:cyclist)=17)
      6.748756 = idf(docFreq=4, maxDocs=1569)
      0.046875 = fieldNorm(field=tagSet, doc=177)
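
To make the difference between the two explain outputs concrete, here is a rough Python sketch that recomputes both scores from the numbers shown above. It assumes the classic Lucene TF-IDF formulas (tf = sqrt(termFreq), idf = 1 + ln(maxDocs / (docFreq + 1))), which agree with the tf and idf values in the output; the function names are made up for illustration and are not part of any Lucene or Elasticsearch API.

```python
import math


def tf(term_freq):
    # Assumed classic Lucene similarity: square root of the raw term frequency.
    return math.sqrt(term_freq)


def idf(doc_freq, max_docs):
    # Assumed classic Lucene similarity: 1 + ln(maxDocs / (docFreq + 1)).
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def weighted_score(boost, query_norm, term_freq, doc_freq, max_docs, field_norm):
    # weight = queryWeight * fieldWeight, as in the 'cyclist' explain output.
    query_weight = boost * idf(doc_freq, max_docs) * query_norm
    field_weight = tf(term_freq) * idf(doc_freq, max_docs) * field_norm
    return query_weight * field_weight


def constant_score(boost, query_norm):
    # The 'cyclist*' explain output: boost * queryNorm, nothing per-document.
    return boost * query_norm


# Values copied from the explain output above.
lorri = weighted_score(5.0, 0.005982154, 12, 4, 1569, 0.0625)
eric = weighted_score(5.0, 0.005982154, 17, 4, 1569, 0.046875)
wildcard = constant_score(5.0, 0.05)

print(lorri, eric, wildcard)
```

The weighted scores differ per document because termFreq and fieldNorm differ, while the constant score has no per-document input at all, which is why every wildcard hit on tagSet comes out at 0.25. Small differences in the last digits are expected, since Lucene computes these values in single precision.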

Is there any way to change this behavior so query terms ending with an
asterisk use a weighted score?

Thanks.

Mike


(Clinton Gormley) #2

Hi Mike

> Is there any way to change this behavior so query terms ending with an
> asterisk use a weighted score?

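Use an edge-ngram analyzer to index your data. Each leading prefix of a
word is then indexed as its own term, so a prefix search becomes an
ordinary term match and is scored with the usual weighting instead of a
constant score.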
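
As a rough illustration of what edge n-gram indexing produces, the Python sketch below lists the leading prefixes of a token. It is only a conceptual model of an edge n-gram token filter, not Elasticsearch code; the function name and the min_gram/max_gram values are assumptions chosen for the example.

```python
def edge_ngrams(token, min_gram=1, max_gram=20):
    """Return the leading substrings of token with lengths min_gram..max_gram."""
    longest = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, longest + 1)]


# Prefixes that would be indexed for 'cyclist' with a hypothetical min_gram of 3.
indexed_terms = edge_ngrams("cyclist", min_gram=3)
print(indexed_terms)
# ['cyc', 'cycl', 'cycli', 'cyclis', 'cyclist']
```

Because 'cycl' is itself one of the indexed terms, a search for 'cycl' without a wildcard can match 'cyclist' as a plain term query.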

You may then want to use a bool query to boost whole-word matches, e.g.:

"query" : {
   "bool" : {
      "must" : [
         {
            "field" : {
               "tokens.ngram" : "+foo +bar"
            }
         }
      ],
      "should" : [
         {
            "field" : {
               "tokens" : {
                  "boost" : 1,
                  "query" : "foo bar"
               }
            }
         }
      ]
   }
}

clint

