With the sample document and queries below, I get different scores from a span_near query depending on the in_order attribute. Why does phraseFreq change from 0.16666667 when in_order is set to false to 0.25 when it is set to true?
From what I have gathered, phraseFreq is calculated as 1/(dist+1).
In both queries I would expect dist to be 3, hence a phraseFreq of 1/4 = 0.25. This explains the phraseFreq when in_order is set to true, but not when it is set to false.
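As a sanity check on the arithmetic, here is a small Python sketch. It is not Lucene's code: the helper names are made up for illustration, and it assumes that dist counts the unmatched tokens lying between the matched terms, which is only my reading of the formula.

```python
import math

def sloppy_freq(dist):
    """1 / (dist + 1), the formula quoted above (hypothetical helper name)."""
    return 1.0 / (dist + 1)

def implied_dist(phrase_freq):
    """Invert the formula: which dist would produce the reported phraseFreq?"""
    return round(1.0 / phrase_freq - 1)

# 0-based token positions of the matched terms in "a b c d e f g h i".
positions = {"a": 0, "c": 2, "f": 5}

# Assumption: dist = number of unmatched tokens between consecutive matches.
gaps = (positions["c"] - positions["a"] - 1) + (positions["f"] - positions["c"] - 1)

print(gaps, sloppy_freq(gaps))      # dist under the assumption, and its phraseFreq
print(implied_dist(0.25))           # dist implied by the in_order=true result
print(implied_dist(0.16666667))     # dist implied by the in_order=false result

# The tf values in the explain output are consistent with sqrt(freq).
print(math.sqrt(0.25), math.sqrt(0.16666667))
```

Under that assumption dist is 3, which matches the in_order=true result. The in_order=false result implies a dist of 5, which happens to equal the position difference between a (0) and f (5); whether that is really how the unordered case measures distance is the part I cannot explain.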
Example doc:
PUT testindex/type/1
{
  "text": "a b c d e f g h i"
}
Query with in_order set to false:
GET testindex/type/_search
{
  "explain": true,
  "query": {
    "span_near": {
      "clauses": [
        { "span_term": { "text": "a" } },
        { "span_term": { "text": "c" } },
        { "span_term": { "text": "f" } }
      ],
      "slop": 100,
      "in_order": false,
      "collect_payloads": false
    }
  }
}
RESULT:
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.11744264,
    "hits": [
      {
        "_shard": 3,
        "_node": "qGVR96XTSdORS8pnRjHr2g",
        "_index": "test",
        "_type": "type",
        "_id": "1",
        "_score": 0.11744264,
        "_source": {
          "text": "a b c d e f g h i"
        },
        "_explanation": {
          "value": 0.11744264,
          "description": "sum of:",
          "details": [
            {
              "value": 0.11744264,
              "description": "weight(spanNear([text:a, text:c, text:f], 100, false) in 0) [PerFieldSimilarity], result of:",
              "details": [
                {
                  "value": 0.11744264,
                  "description": "fieldWeight in 0, product of:",
                  "details": [
                    {
                      "value": 0.4082483,
                      "description": "tf(freq=0.16666667), with freq of:",
                      "details": [
                        {
                          "value": 0.16666667,
                          "description": "phraseFreq=0.16666667",
                          "details": []
                        }
                      ]
                    }
                    ...
Query with in_order set to true:
GET testindex/type/_search
{
  "explain": true,
  "query": {
    "span_near": {
      "clauses": [
        { "span_term": { "text": "a" } },
        { "span_term": { "text": "c" } },
        { "span_term": { "text": "f" } }
      ],
      "slop": 100,
      "in_order": true,
      "collect_payloads": false
    }
  }
}
RESULT:
{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.14383726,
    "hits": [
      {
        "_shard": 3,
        "_node": "qGVR96XTSdORS8pnRjHr2g",
        "_index": "test",
        "_type": "type",
        "_id": "1",
        "_score": 0.14383726,
        "_source": {
          "text": "a b c d e f g h i"
        },
        "_explanation": {
          "value": 0.14383726,
          "description": "sum of:",
          "details": [
            {
              "value": 0.14383726,
              "description": "weight(spanNear([text:a, text:c, text:f], 100, true) in 0) [PerFieldSimilarity], result of:",
              "details": [
                {
                  "value": 0.14383726,
                  "description": "fieldWeight in 0, product of:",
                  "details": [
                    {
                      "value": 0.5,
                      "description": "tf(freq=0.25), with freq of:",
                      "details": [
                        {
                          "value": 0.25,
                          "description": "phraseFreq=0.25",
                          "details": []
                        }
                      ]
                    ...