Combining Multiple Function Scores

I've got several different Elasticsearch function_score queries, but I'm not sure how to combine them.

This is the test set I'm looking at. (I added comments so I can refer to specific items in the question; these comments are not actually in the index.)

[
    { // Item 1
        "priority": 0.7,
        "classification": [
            {
                "feature": "A",
                "confidence": 0.4
            },
            {
                "feature": "C",
                "confidence": 0.3
            },
            {
                "feature": "B",
                "confidence": 0.6
            }
        ]
    },
    { // Item 2
        "priority": 0.8,
        "classification": [
            {
                "feature": "A",
                "confidence": 0.3
            },
            {
                "feature": "C",
                "confidence": 0.6
            }
        ]
    },
    { // Item 3
        "priority": 0.4,
        "classification":  [
            {
                "feature": "D",
                "confidence": 0.6
            },
            {
                "feature": "C",
                "confidence": 0.8
            }
        ]
    }
]

Now assume I want to score items with the following weights:

  • "A" with weight of 2
  • "B" with weight of 3

I would like to do the following:

  1. Calculate average confidence for each item only for features "A" and "B" (e.g. average confidence of 0.5 for item 1)
  2. Calculate priority for each item (e.g. priority of 0.8 for item 2)
  3. Calculate the sum of weights for each item feature (if item has feature "A" it receives a weight of 2, if it has feature "B" it receives a weight of 3. e.g. item 1 would receive a weight of 5 and item 2 a weight of 2)
  4. Combine the different calculations into a final score (I will need to experiment a little bit with different "combine" functions)
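
To make steps 1 to 3 concrete, here is a minimal Python sketch that computes the three components for the test set outside Elasticsearch. The names ITEMS, FEATURE_WEIGHTS and components are made up for illustration, and I'm assuming an item with neither "A" nor "B" gets an average confidence of 0.

```python
# Hypothetical in-memory copy of the three test items from the question.
ITEMS = [
    {"priority": 0.7, "classification": [
        {"feature": "A", "confidence": 0.4},
        {"feature": "C", "confidence": 0.3},
        {"feature": "B", "confidence": 0.6},
    ]},
    {"priority": 0.8, "classification": [
        {"feature": "A", "confidence": 0.3},
        {"feature": "C", "confidence": 0.6},
    ]},
    {"priority": 0.4, "classification": [
        {"feature": "D", "confidence": 0.6},
        {"feature": "C", "confidence": 0.8},
    ]},
]

# Weights from the question: "A" counts 2, "B" counts 3.
FEATURE_WEIGHTS = {"A": 2, "B": 3}


def components(item, weights=FEATURE_WEIGHTS):
    """Return (average confidence, priority, weight sum) for one item."""
    # Only classifications whose feature has a weight take part.
    matched = [c for c in item["classification"] if c["feature"] in weights]
    # Assumption: no matching feature means an average confidence of 0.
    if matched:
        avg_confidence = sum(c["confidence"] for c in matched) / len(matched)
    else:
        avg_confidence = 0.0
    # Each distinct matching feature contributes its weight once.
    weight_sum = sum(weights[f] for f in {c["feature"] for c in matched})
    return avg_confidence, item["priority"], weight_sum


for number, item in enumerate(ITEMS, start=1):
    print(number, components(item))
```

Item 1 should come out with an average confidence of 0.5 and a weight of 5, and item 2 with a weight of 2, matching the examples in the list above.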

I know how to create the function_score for the average confidence, it would be something like this:

{
  "nested": {
    "path": "classification",
    "query": {
       "function_score": {
          "functions": [
              {
                  "field_value_factor": {
                      "field": "classification.confidence",
                      "missing": 0
                  },
                  "weight": 1
              }
          ],
          "query": {
              "terms": {
                  "classification.feature": [
                      "A",
                      "B"
                  ]
              }
          },
          "score_mode": "avg"
        }
    }
  }
}

I also know how to create the function score for the priority field, it would be something like this:

{
    "function_score": {
        "functions": [
            {
                "field_value_factor": {
                    "field": "priority",
                    "missing": 0
                },
                "weight": <some-weight>
            }
        ],
        "score_mode": "sum"
    }
}

I think (but I'm not sure) I know how to create the function score for the sum of feature weights, ignoring features that don't match "A" or "B". It would probably be something like this:

{
  "query": {
        "function_score": {
            "query": {
                "bool": {
                    "should": [
                        { "match": { "classification.feature": "A" } },
                        { "match": { "classification.feature": "B" } }
                    ]
                }
            },
            "functions": [
              {
                  "filter": { "match": { "classification.feature": "A" } },
                  "weight": 2
              },
              {
                  "filter": { "match": { "classification.feature": "B" } },
                  "weight": 3
              }
            ],
            "score_mode":"sum"
        }
    }
}

But I have no idea how to combine these 3 different function scores. I'm not yet sure what the actual combine function will be; I will need to play with different functions and decide which one works best for me. For the sake of the question, let's say I would like to take the average of the results of my 3 function_score queries.

And so my questions are:

  1. Is it possible to define multiple function_score and then define how to combine them?
  2. If it's not possible to combine multiple function_score what approach should I take in order to solve this issue? (I'm not fixated on using 3 different function_score but not sure how to do it otherwise)
  3. Although I said I want to do average on all the function_score results I may later want to do something a bit more complicated like this: score("popularity") + (score("feature-weight") * score("confidence")) - is there a way to achieve this?
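
To pin down what "combine" means here, this is a small Python sketch of the two candidate combine functions, applied to the component values of item 1 from the test set (priority 0.7, feature weight 5, average confidence 0.5). The function names are hypothetical; this only illustrates the arithmetic and is not an Elasticsearch feature.

```python
def combine_average(priority, feature_weight, confidence):
    """Plain average of the three partial scores."""
    return (priority + feature_weight + confidence) / 3


def combine_formula(priority, feature_weight, confidence):
    """score("popularity") + (score("feature-weight") * score("confidence"))."""
    return priority + feature_weight * confidence


# Component values for item 1, taken from the test set in the question.
item1 = {"priority": 0.7, "feature_weight": 5, "confidence": 0.5}

for combine in (combine_average, combine_formula):
    print(combine.__name__, round(combine(**item1), 4))
```

Keeping the combine step as a separate function makes it easy to swap in other formulas while experimenting.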

I'm currently testing this on ES 2.4.5 (which I know is deprecated). We are going to upgrade pretty soon anyway but:

  • Is it only possible to achieve with later ES versions?
  • Even if it's only possible in later ES versions I would still like to know how to accomplish it (and use it after we upgrade)

Googling this didn't turn up any useful information.

Thanks in advance


I'm not sure I understand what you're trying to do. Maybe you can explain exactly what the resulting score of, for example, the first document would be, and how it would be calculated, so we can help you figure out how best to do that. :)

You are totally right. I completely rewrote the question to make it easier to understand (hopefully).
Let me know if any more information is needed.

Would you be open to restructuring your documents? You're currently using nested objects. While not impossible, it's hard to flexibly combine scores from different nested objects. If you create flat documents instead, what you're trying to do becomes much easier.

For example, if you indexed your documents like this:

PUT my_index/doc/1
{
  "priority": 0.7,
  "classification": [
    {
      "A": 0.4
    },
    {
      "C": 0.3
    },
    {
      "B": 0.6
    }
  ]
}

PUT my_index/doc/2
{
  "priority": 0.8,
  "classification": [
    {
      "A": 0.3
    },
    {
      "C": 0.6
    }
  ]
}

PUT my_index/doc/3
{
  "priority": 0.4,
  "classification": [
    {
      "D": 0.6
    },
    {
      "C": 0.8
    }
  ]
}

Then all you need is a relatively straightforward script_score function to calculate the score that you want:

GET my_index/_search
{
  "query": {
    "function_score": {
      "query": {
        "match_all": {}
      },
      "functions": [
        {
          "script_score": {
            "script": "def popularity = doc['priority'].value; def feature_weight = 0.0; feature_weight += (doc['classification.A'].size() != 0 ? 2 : 0 ); feature_weight += (doc['classification.B'].size() != 0 ? 3 : 0 ); def confidence = 0.0; if (!(doc['classification.A'].size() == 0 && doc['classification.B'].size() == 0)) { confidence = (doc['classification.A'].value + doc['classification.B'].value) / (doc['classification.A'].size() + doc['classification.B'].size())}; return popularity + (feature_weight * confidence);"
          }
        }
      ]
    }
  }
}

(You will need to set script.inline: true in your elasticsearch.yml configuration file to run inline scripts in version 2.)
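
Since the script has to fit on one line inside the JSON string, it is hard to read. Here is a rough Python rendering of the same logic, just to show what it computes. The dict layout and the helper names are assumptions made for this sketch (each field maps to a list of values, standing in for doc values); it is not something Elasticsearch runs.

```python
def first_or_zero(values):
    """Stand-in for doc[...].value, assumed to be 0 when the field is missing."""
    return values[0] if values else 0.0


def script_score(doc):
    """Mirror of the script: popularity + (feature_weight * confidence)."""
    a = doc.get("classification.A", [])
    b = doc.get("classification.B", [])
    popularity = doc["priority"]

    # Feature "A" adds 2, feature "B" adds 3, when present.
    feature_weight = (2 if a else 0) + (3 if b else 0)

    # Average confidence over the A and B values that exist.
    confidence = 0.0
    if a or b:
        confidence = (first_or_zero(a) + first_or_zero(b)) / (len(a) + len(b))

    return popularity + feature_weight * confidence


# Hypothetical flattened view of the three example documents.
docs = [
    {"priority": 0.7, "classification.A": [0.4], "classification.B": [0.6]},
    {"priority": 0.8, "classification.A": [0.3]},
    {"priority": 0.4},
]

for number, doc in enumerate(docs, start=1):
    print(number, round(script_score(doc), 4))
```

With these inputs the documents should rank 1, 2, 3, with scores of roughly 3.2, 1.4 and 0.4.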
