How does span near scoring work

symphony · June 16, 2025, 6:28am

Hi, I'm facing an issue with the keyword search I'm building: some documents rack up a huge score seemingly out of nowhere and overtake documents that appear more relevant. For example, if the query is "quick brown fox", long, wordy documents in which the only somewhat relevant word is "for" (matched through fuzzy matching) manage to overtake documents that contain exact matches for all or part of "quick brown fox".

I suspected the cause was the repeated "for" matching against the query. However, when I tested this by creating a document with many repetitions of the word "fox", it did not overtake the documents containing "quick brown fox".

I'd appreciate any insights into how the scoring works for span near queries or guidance on how to prevent documents with weak matches from overpowering higher-quality matches.

I'm using a bool query over multiple span_near queries (whose clauses are span_multi queries wrapping fuzzy queries) to accommodate partial matches. The query is split into segments to support documents that may not contain all the words in the original phrase. Each segment combination is wrapped in a span_near and passed into the should clause of the bool query. This is what the search request looks like when the query is "quick brown fox":

GET /connector-test-8.18-connector-new/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "span_near": {
            "clauses": [
              { "span_multi": { "match": { "fuzzy": { "body": { "value": "quick", "fuzziness": "auto" } } } } }
            ],
            "slop": 1,
            "in_order": true,
            "boost": 1
          }
        },
        {
          "span_near": {
            "clauses": [
              { "span_multi": { "match": { "fuzzy": { "body": { "value": "quick", "fuzziness": "auto" } } } } },
              { "span_multi": { "match": { "fuzzy": { "body": { "value": "brown", "fuzziness": "auto" } } } } }
            ],
            "slop": 2,
            "in_order": true,
            "boost": 2
          }
        },
        {
          "span_near": {
            "clauses": [
              { "span_multi": { "match": { "fuzzy": { "body": { "value": "quick", "fuzziness": "auto" } } } } },
              { "span_multi": { "match": { "fuzzy": { "body": { "value": "brown", "fuzziness": "auto" } } } } },
              { "span_multi": { "match": { "fuzzy": { "body": { "value": "fox", "fuzziness": "auto" } } } } }
            ],
            "slop": 3,
            "in_order": true,
            "boost": 3
          }
        },
        {
          "span_near": {
            "clauses": [
              { "span_multi": { "match": { "fuzzy": { "body": { "value": "fox", "fuzziness": "auto" } } } } }
            ],
            "slop": 1,
            "in_order": true,
            "boost": 1
          }
        },
        {
          "span_near": {
            "clauses": [
              { "span_multi": { "match": { "fuzzy": { "body": { "value": "brown", "fuzziness": "auto" } } } } }
            ],
            "slop": 1,
            "in_order": true,
            "boost": 1
          }
        },
        {
          "span_near": {
            "clauses": [
              { "span_multi": { "match": { "fuzzy": { "body": { "value": "brown", "fuzziness": "auto" } } } } },
              { "span_multi": { "match": { "fuzzy": { "body": { "value": "fox", "fuzziness": "auto" } } } } }
            ],
            "slop": 2,
            "in_order": true,
            "boost": 2
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}
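
For reference, here is a minimal Python sketch of how a request body like the one above could be generated. It is only an illustration: the function name `build_span_near_query` is hypothetical, and the rules it applies (contiguous segments only, with `slop` and `boost` both set to the number of terms in the segment) are inferred from the example request rather than taken from any documented convention.

```python
import json


def build_span_near_query(phrase, field="body"):
    """Build a bool/should query with one span_near per contiguous segment.

    Assumption: slop and boost both equal the number of terms in the
    segment, which is the pattern visible in the example request.
    """
    terms = phrase.split()
    should = []
    for start in range(len(terms)):
        for end in range(start + 1, len(terms) + 1):
            segment = terms[start:end]
            # Each term becomes a fuzzy query wrapped in span_multi.
            clauses = [
                {
                    "span_multi": {
                        "match": {
                            "fuzzy": {field: {"value": term, "fuzziness": "auto"}}
                        }
                    }
                }
                for term in segment
            ]
            should.append(
                {
                    "span_near": {
                        "clauses": clauses,
                        "slop": len(segment),
                        "in_order": True,
                        "boost": len(segment),
                    }
                }
            )
    return {"query": {"bool": {"should": should, "minimum_should_match": 1}}}


body = build_span_near_query("quick brown fox")
print(json.dumps(body, indent=2))
```

The clauses come out in a different order than in the request above, which should not matter because the order of should clauses does not affect scoring.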

Kathleen_DeRusso · June 16, 2025, 12:11pm

Hey there @symphony and welcome to the community!

You may be interested in checking out the dis_max query. I think it will meet your needs for smoothing relevance scores.
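
As a rough illustration, the same span_near clauses could be placed under `dis_max` instead of `bool`/`should`. A bool query adds up the scores of every matching should clause, whereas dis_max takes the score of the single best-matching clause and adds `tie_breaker` times the scores of the other matching clauses. The Python sketch below only builds the request body; the helper names `fuzzy_span_near` and `dis_max_query` are hypothetical, and the `tie_breaker` value of 0.0 is a starting assumption to tune, not a recommendation.

```python
import json


def fuzzy_span_near(terms, field="body"):
    """One span_near over fuzzy span_multi clauses, as in the original request."""
    return {
        "span_near": {
            "clauses": [
                {
                    "span_multi": {
                        "match": {
                            "fuzzy": {field: {"value": term, "fuzziness": "auto"}}
                        }
                    }
                }
                for term in terms
            ],
            "slop": len(terms),
            "in_order": True,
            "boost": len(terms),
        }
    }


def dis_max_query(queries, tie_breaker=0.0):
    """Wrap the given queries in dis_max so the best clause dominates the score."""
    return {
        "query": {
            "dis_max": {"queries": list(queries), "tie_breaker": tie_breaker}
        }
    }


# The six segments used in the original request for "quick brown fox".
segments = [
    ["quick"],
    ["quick", "brown"],
    ["quick", "brown", "fox"],
    ["fox"],
    ["brown"],
    ["brown", "fox"],
]
body = dis_max_query([fuzzy_span_near(s) for s in segments], tie_breaker=0.0)
print(json.dumps(body, indent=2))
```

With `tie_breaker` at 0.0 only the best clause counts; raising it toward 1.0 moves the behavior closer to summing all matching clauses.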
