Trouble boosting terms in complex span queries

Hi,

I'm having trouble boosting individual terms in complex span queries in Elasticsearch. Here's a toy example:

Mapping and sample documents:

PUT my_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "text": {
          "type": "text",
          "term_vector": "with_positions_offsets"
        }
      }
    }
  }
}

PUT my_index/my_type/1
{
  "text": "quick fox"
}

PUT my_index/my_type/2
{
  "text": "brown fox"
}

Here is a sample request and the explain output it returns. Request:

GET my_index/_search?_source=false
{
  "explain" : true,
  "query" : {
    "span_near" : {
      "slop": 12,
      "in_order" : true,
      "clauses" : [
        {
          "span_or": {
            "clauses": [
              {
                "span_term" : {
                  "text" : {
                    "value" : "quick",
                    "boost" : 2
                  }
                }
              },
              {
                "span_term" : {
                  "text" : {
                    "value" : "brown",
                    "boost" : 1
                  }
                }
              }
            ]
          }
        }, 
        { 
          "span_term" : {
            "text" : {
              "value" : "fox",
              "boost" : 1
            }
          }
        }
      ]
    }
  },
  "size":1500
}

Response:

{
  "took": 68,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1.9616584,
    "hits": [
      {
        "_shard": "[my_index][2]",
        "_node": "Q1t0VhAVTOK-OY-9z8eBCA",
        "_index": "my_index",
        "_type": "my_type",
        "_id": "2",
        "_score": 1.9616584,
        "_explanation": {
          "value": 1.9616585,
          "description": "weight(spanNear([spanOr([(text:quick)^2.0, text:brown]), text:fox], 12, true) in 0) [PerFieldSimilarity], result of:",
        }
      },
      {
        "_shard": "[my_index][3]",
        "_node": "Q1t0VhAVTOK-OY-9z8eBCA",
        "_index": "my_index",
        "_type": "my_type",
        "_id": "1",
        "_score": 1.9616584,
        "_explanation": {
          "value": 1.9616585,
          "description": "weight(spanNear([spanOr([(text:quick)^2.0, text:brown]), text:fox], 12, true) in 0) [PerFieldSimilarity], result of:",
        }
      }
    ]
  }
}

As far as I can tell, Elasticsearch does see the boost, judging by this string in the response:

"description": "weight(spanNear([spanOr([(text:quick)^2.0, text:brown]), text:fox], 12, true) in 0) [PerFieldSimilarity], result of:"

but it has no effect on the resulting score. The same request with no boost parameters at all produces the same scores. Note that if I use a simple request like this one:

GET my_index/_search?_source=false
{
  "explain" : true,
  "query" : {
    "span_term" : {
      "text" : {
        "value" : "quick",
        "boost" : 2
      }
    }
  },
  "size":1500
}

I get the expected result. How can I achieve proper boosting in complex span queries?

Yes, span queries always ignore boosts except on the outermost span query, which is the span_near query in your case.

Span queries can do two things:

  • produce a list of matching spans
  • produce scores

Compound span queries like span_near only use the matching spans of their sub span queries to find their own spans, which they then use to produce a score. Scores are never computed on sub span queries, which is why boosts are not applied: boosts only influence how scores are computed, not which spans match.

I opened an issue to fix this: https://github.com/elastic/elasticsearch/issues/28390.
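
In the meantime, one possible workaround follows from the explanation above: since only the boost on the outermost span query is honoured, each alternative of the span_or can be given its own span_near, with the boost set on that span_near, and the results combined in a bool query. The Python sketch below only builds the request body. The helper name boosted_span_near is hypothetical, and the approach has not been verified against a cluster, so treat it as an assumption to check with "explain": true.

```python
import json


def boosted_span_near(field, alternatives, tail_term, slop=12, in_order=True):
    """Expand span_near(span_or(alternatives), tail_term) into a bool query.

    `alternatives` maps each alternative first term to the boost it should
    carry. Every alternative gets its own span_near clause, and the boost is
    placed on that outer span_near rather than on the inner span_term.
    """
    should = []
    for term, boost in alternatives.items():
        should.append({
            "span_near": {
                "clauses": [
                    {"span_term": {field: {"value": term}}},
                    {"span_term": {field: {"value": tail_term}}},
                ],
                "slop": slop,
                "in_order": in_order,
                "boost": boost,  # outer-level boost, per the answer above
            }
        })
    # At least one of the expanded span_near clauses has to match.
    return {"bool": {"should": should, "minimum_should_match": 1}}


if __name__ == "__main__":
    body = {
        "explain": True,
        "query": boosted_span_near("text", {"quick": 2, "brown": 1}, "fox"),
    }
    # Paste the printed JSON as the body of GET my_index/_search
    print(json.dumps(body, indent=2))
```

Note that this is not strictly equivalent to the original query: a document that matches more than one alternative has the scores of all matching should clauses added together, whereas the single span_near produces one score.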
