Different number of results from a bool query vs. a bool query wrapping function_score queries

I am trying to add a custom boost to each should clause of a bool query,
but the number of results changes depending on how I write it. A bool
query whose 2 should clauses are simple_query_string queries returns a
different number of results than a bool query whose 2 should clauses are
function_score queries wrapping the same simple_query_string queries.
The following query returns 2 results for my data set:
{
  "query" : {
    "filtered" : {
      "query" : {
        "bool" : {
          "should" : [ {
            "simple_query_string" : {
              "query" : "128",
              "fields" : [ "content.name_enu.simple" ]
            }
          }, {
            "simple_query_string" : {
              "query" : "128",
              "fields" : [ "content.name_enu.simple_with_numeric" ]
            }
          } ]
        }
      },
      "filter" : {
        "bool" : {
          "must" : [ {
            "term" : {
              "securityInfo.securityType" : "open"
            }
          }, {
            "bool" : {
              "must" : [ {
                "term" : {
                  "sourceId.sourceSystem" : "jmeter_007971_numeric"
                }
              }, {
                "term" : {
                  "sourceId.type" : "file"
                }
              } ]
            }
          } ],
          "_cache" : true
        }
      }
    }
  },
  "fields" : [ "elementId", "sourceId.id", "sourceId.type",
               "sourceId.sourceSystem", "sourceVersion", "content.name_enu" ]
}

Whereas the following query, which wraps the same simple query strings in
function scores, returns 5 results:
{
  "query" : {
    "filtered" : {
      "query" : {
        "bool" : {
          "should" : [ {
            "function_score" : {
              "query" : {
                "simple_query_string" : {
                  "query" : "128",
                  "fields" : [ "content.name_enu.simple" ]
                }
              },
              "boost_factor" : 1.5
            }
          }, {
            "function_score" : {
              "query" : {
                "simple_query_string" : {
                  "query" : "128",
                  "fields" : [ "content.name_enu.simple_with_numeric" ]
                }
              },
              "boost_factor" : 2.5
            }
          } ]
        }
      },
      "filter" : {
        "bool" : {
          "must" : [ {
            "term" : {
              "securityInfo.securityType" : "open"
            }
          }, {
            "bool" : {
              "must" : [ {
                "term" : {
                  "sourceId.sourceSystem" : "jmeter_007971_numeric"
                }
              }, {
                "term" : {
                  "sourceId.type" : "file"
                }
              } ]
            }
          } ],
          "_cache" : true
        }
      }
    }
  },
  "fields" : [ "elementId", "sourceId.id", "sourceId.type",
               "sourceId.sourceSystem", "sourceVersion", "content.name_enu" ]
}

From my understanding of how the should clause works, I expected both
queries to return 5 results, and I cannot understand why the first query
returns only 2 for my data set. The "content.name_enu.simple" field uses
the simple analyzer, whereas "content.name_enu.simple_with_numeric" uses a
whitespace tokenizer and a lowercase filter.
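
As a rough illustration of the analyzer difference described above, the
sketch below approximates both analysis chains in plain Python. It is not
Elasticsearch code, and the function names and sample strings are made up
for illustration. It shows that a letter-only tokenizer, which is what the
simple analyzer is built on, produces no tokens at all for a purely numeric
term such as "128", while whitespace tokenization keeps it. Whether this is
what causes the different result counts here is only a guess; running the
_analyze API against the real index would show what each field actually
does with the term.

```python
import re


def simple_analyzer(text):
    # Approximation of the "simple" analyzer: keep runs of letters only
    # (digits and punctuation act as separators), then lowercase.
    return [token.lower() for token in re.findall(r"[^\W\d_]+", text)]


def whitespace_lowercase(text):
    # Approximation of a whitespace tokenizer followed by a lowercase filter.
    return [token.lower() for token in text.split()]


# Hypothetical sample values, not taken from the real data set.
for sample in ["128", "Report 128", "file128.txt"]:
    print(repr(sample), simple_analyzer(sample), whitespace_lowercase(sample))
```

For the search term "128" the first function returns an empty list and the
second returns ["128"], so the two fields may treat the same query text very
differently.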

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/0e31e1c7-8b07-4220-abc9-c520d681495a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

The function score should not affect which documents match, only how they
are scored, so the number of results should not differ. Strange.

Perhaps you do not need to use a function score. With the simple query
string, you can append the boost parameter to the field name:

"simple_query_string": {
  "query": "128",
  "fields": [
    "content.name_enu.simple^1.5"
  ]
}

Since your example query is just a simple term and not a Lucene query, you
should probably use a match query, which accepts a boost parameter.
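
A minimal sketch of that suggestion, building the bool/should part of the
request as a Python dict. The helper name boosted_match is hypothetical; the
field names and boost values are the ones from the original post. Only the
query part is shown, so it would still need to go inside the same filtered
query as before.

```python
import json


def boosted_match(field, text, boost):
    # One should clause: a match query on a single field with its own boost.
    return {"match": {field: {"query": text, "boost": boost}}}


query = {
    "bool": {
        "should": [
            boosted_match("content.name_enu.simple", "128", 1.5),
            boosted_match("content.name_enu.simple_with_numeric", "128", 2.5),
        ]
    }
}

# Print the JSON that would be sent as the "query" element of the request.
print(json.dumps(query, indent=2))
```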

Cheers,

Ivan

(In reply to Akshay Shukla's message of Tue, Aug 26, 2014 at 4:15 PM.)


Forgot to add that since your search term is the same, besides using a
match query, you can also use a multi match query. Your queries would be
easier to read.
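
A sketch of what the multi match variant might look like, again as a Python
dict. The variable name body is hypothetical; the field names and boosts
come from the original post, using the field^boost syntax. Note that
multi_match by default scores a document by its best-matching field rather
than adding up the clause scores the way bool/should does, so the ranking
may differ; the type parameter (for example most_fields) changes how the
per-field scores are combined.

```python
import json

# Same search term against both fields, with a per-field boost.
body = {
    "multi_match": {
        "query": "128",
        "fields": [
            "content.name_enu.simple^1.5",
            "content.name_enu.simple_with_numeric^2.5",
        ],
    }
}

# Print the JSON that would replace the bool/should block in the request.
print(json.dumps(body, indent=2))
```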

Not sure why your original query is not working. If you post some example
documents and the mapping, others might be able to figure it out.

--
Ivan

(Follow-up to Ivan Brusic's message of Wed, Aug 27, 2014 at 11:07 AM.)