How to: function_score + script_score + nested doc field?


(James Addison) #1

Hey! It's nearing the end of 2018, and I'm wondering if there's a way to accomplish the following with ES 6.4.

I've got a function_score query with n script_score functions targeting top-level document fields and 1 script_score function targeting a nested doc field.

How can I achieve this in a single function_score query? My understanding is that the nested doc value won't be found in its script score query due to the function_score's query not being nested.

mapping:

{
   "activity-related-20181016-215041": {
      "mappings": {
         "doc": {
            "properties": {
               "events": {
                  "type": "nested",
                  "properties": {
                     "date_range": {
                        "type": "date_range"
                     },
                     "days_of_week": {
                        "type": "short"
                     }
                  }
               },
               "photo": {
                  "type": "keyword"
               },
               "relation": {
                  "type": "join",
                  "eager_global_ordinals": true,
                  "relations": {
                     "location": "activity"
                  }
               },
               "status": {
                  "type": "short"
               },
               "tags": {
                  "properties": {
                     "id": {
                        "type": "long"
                     },
                     "name": {
                        "type": "text",
                        "index": false
                     },
                     "slug": {
                        "type": "keyword"
                     }
                  }
               }
            }
         }
      }
   }
}

a query that has a script_score function targeting the nested value:

{
   "query": {
      "function_score": {
         "score_mode": "sum",
         "boost_mode": "replace",
         "query": {
            "term": {
               "relation": "activity"
            }
         },
         "functions": [
            {
               "script_score": {
                  "script": "for (long x : doc['events.days_of_week'].values) { if (x == 1) {return 100} } "
               }
            },
            {
               "script_score": {
                  "script": "for (long x : doc['tags.id'].values) { if (x == 1) {return 100} } "
               }
            },
            {
               "script_score": {
                  "script": "doc['photo'].value.length() > 0 ? 10 : 0"
               }
            },
            {
               "script_score": {
                  "script": "doc['status'].value >= 50 ? (doc['status'].value / 10 - 4) * 5 : 0"
               }
            }
         ]
      }
   }
}

Notice that the first two functions target events.days_of_week (a nested doc field) and tags.id (just an object field). The latter affects the score, while the former has no impact because of the nesting.

How can I achieve scoring that takes into account both document and nested document field data?


(Mayya Sharipova) #2

No, unfortunately, you can't mix nested and top level docs for the same scoring - even at the end of 2018.

Nested objects are indexed as separate Lucene documents from their top-level document.
The function_score query works at the level of Lucene documents, processing one matched Lucene document at a time, so a script running against a top-level document finds no values for events.days_of_week.

You can either:

  1. Use a top-level query for function_score, which returns only top-level documents, so your script_score calculates scores for those.
  2. Or use a nested query with inner_hits and calculate script_score only for the nested documents:
{
  "query": {
    "nested": {
      "inner_hits": {},
      "path": "events",
      "query": {
        "function_score": {
          "script_score": {
            "script": {
              "source": "doc['events.days_of_week'].value"
            }
          }
        }
      }
    }
  }
}

Even if a query returned nested and top-level documents together, function_score would still process them separately, one by one, and there would be no way to access both in the same function.
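
One workaround that might get close, as a sketch I have not run against your index: score each level in its own function_score and let a bool query add the two scores together. This does not mix nested and top-level fields in a single function; it only sums the results of two separate ones. The nested score_mode of "max" (so any matching event contributes 100 once) and the explicit "return 0" fallback are my assumptions; the top-level scripts are copied from your query.

```json
{
  "query": {
    "bool": {
      "filter": { "term": { "relation": "activity" } },
      "should": [
        {
          "nested": {
            "path": "events",
            "score_mode": "max",
            "query": {
              "function_score": {
                "boost_mode": "replace",
                "script_score": {
                  "script": {
                    "source": "for (long x : doc['events.days_of_week'].values) { if (x == 1) { return 100; } } return 0;"
                  }
                }
              }
            }
          }
        },
        {
          "function_score": {
            "score_mode": "sum",
            "boost_mode": "replace",
            "functions": [
              { "script_score": { "script": "for (long x : doc['tags.id'].values) { if (x == 1) {return 100} } " } },
              { "script_score": { "script": "doc['photo'].value.length() > 0 ? 10 : 0" } },
              { "script_score": { "script": "doc['status'].value >= 50 ? (doc['status'].value / 10 - 4) * 5 : 0" } }
            ]
          }
        }
      ]
    }
  }
}
```

Documents without any events should only match the second should clause, so they keep just their top-level score. If each matching event should add to the total instead of counting once, "sum" may be a better nested score_mode than "max".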

