Score influenced by field value (terms_set query)

Hello

I'm new to Elasticsearch. I really like it, but so far I haven't found a solution to one specific use case (and I've been looking for quite a while):

Part of my mapping:

entities: {
  properties: {
    term: {
      type: 'keyword'
    },
    type: {
      type: 'keyword'
    },
    salience: {
      type: 'float'
    }
  }
}

Data looks like this:

{
    "term": "class",
    "type": "OTHER",
    "salience": 0.30540481209754944
},
{
    "term": "reproach",
    "type": "OTHER",
    "salience": 0.1406273990869522
},
{
    "term": "work",
    "type": "OTHER",
    "salience": 0.1406273990869522
}

I'm using a terms_set query to find the best matches based on a variable number of terms. This works well and the score is reasonable.

However, I would like to calculate a new score that takes the "salience" field into account. So, if my query is only for "class", I'd like the documents where this term has a higher "salience" to come first. The formula could be something like "_score + sum([salience of matches]) * [some_factor]", for example. My best idea so far was to use the Painless score context, with two loops to compare the input terms against the available "entities" (and some hope that it won't be too slow). But since I couldn't find a way to access the original input terms from the script, I didn't get anywhere with this.
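To make the intended formula concrete, here is a minimal Python sketch of the calculation I have in mind. It is plain Python, not Elasticsearch code: the function name rescore, the factor default, and the base score of 1.0 are placeholders made up for illustration, and the entity list is just the sample data shown above.

```python
def rescore(base_score, entities, query_terms, factor=1.0):
    """Return base_score + sum(salience of matched entities) * factor.

    base_score stands in for the _score Elasticsearch would compute;
    factor is the hypothetical [some_factor] from the formula.
    """
    wanted = set(query_terms)
    # Add up the salience of every entity whose term was asked for.
    matched_salience = sum(e["salience"] for e in entities if e["term"] in wanted)
    return base_score + matched_salience * factor


# Sample entities of one document, copied from the data above.
entities = [
    {"term": "class", "type": "OTHER", "salience": 0.30540481209754944},
    {"term": "reproach", "type": "OTHER", "salience": 0.1406273990869522},
    {"term": "work", "type": "OTHER", "salience": 0.1406273990869522},
]

# A query for "class" only: the document's salience for "class" is added
# on top of the assumed base score of 1.0.
print(rescore(1.0, entities, ["class"]))
```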

Is this possible? If so, how should I approach it? If required, changing my structure wouldn't be a problem.

For completeness' sake, here's my current query:

GET library/_search
{
	"query": {
		"function_score": {
			"query": {
				"terms_set": {
					"entities.term": {
						"terms": ["work", "voice", "errors", "impressions"],
						"minimum_should_match_script": {
							"source": "1"
						}
					}
				}
			},
			"script_score": {
				"script": {
					"source": "return _score"
				}
			}
		}
	},
	"size": 100
}

I'm using version 7.0.1.

Thank you!

Finding matches and calculating scores based on them in Painless sounds inefficient.
If you don't have many terms, you can take advantage of the should clauses of a bool query: wrap each term in a constant_score query and use your salience value as its boost, something like this:

{
    "query": {
        "bool": {
            "should": [
                {
                    "constant_score": {
                        "filter": {
                            "term": {
                                "entities.term": "class"
                            }
                        },
                        "boost" : 0.30540481209754944
                    }
                },
                {
                    "constant_score": {
                        "filter": {
                            "term": {
                                "entities.term": "reproach"
                            }
                        },
                        "boost" : 0.1406273990869522
                    }
                }
            ]
        }
    }
}
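If the term list comes from application code, a query of this shape can be generated rather than written by hand. Here is a rough Python sketch under a few assumptions: the field name entities.term is taken from the mapping in the question, the boost values are ones you supply per term, and build_boosted_query is a hypothetical helper, not part of any Elasticsearch client.

```python
import json


def build_boosted_query(term_boosts, field="entities.term"):
    """Build a bool query with one constant_score should clause per term.

    term_boosts maps each term to the boost it should contribute when
    it matches. The field default is an assumption based on the mapping.
    """
    should = [
        {
            "constant_score": {
                "filter": {"term": {field: term}},
                "boost": boost,
            }
        }
        for term, boost in term_boosts.items()
    ]
    return {"query": {"bool": {"should": should}}}


body = build_boosted_query(
    {"class": 0.30540481209754944, "reproach": 0.1406273990869522}
)
print(json.dumps(body, indent=4))
```

The printed JSON can then be used as the request body of a search such as GET library/_search.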
