I'm trying to boost the _score of documents using a value (a double) that is stored in a dynamic field inside a nested structure.
This is (part of) my mapping:
{
  "_doc": {
    ...
    "dynamic": "strict",
    "properties": {
      "raw": {
        "type": "keyword",
        "index": false,
        "ignore_above": 0
      },
      "indexed": {
        "type": "text",
        "term_vector": "yes",
        "analyzer": "edge_ngram_analyzer",
        "search_analyzer": "standard",
        "fields": {
          "raw": {
            "type": "keyword",
            "index": true
          }
        }
      },
      ...
      "data": {
        "type": "nested",
        "dynamic": "strict",
        "properties": {
          "key": {"type": "keyword"},
          "name": {"type": "keyword"},
          "type": {"type": "keyword", "index": false},
          "value_string": {"type": "keyword"},
          "value_double": {"type": "double"},
          "value_boolean": {"type": "boolean"},
          "value_date": {"type": "date"}
        }
      },
      "filtered": {
        "type": "nested",
        "dynamic": "strict",
        "properties": {
          "key": {"type": "keyword"},
          "name": {"type": "keyword"},
          "value": {
            "type": "text",
            "analyzer": "folding_analyzer"
          },
          "raw": {
            "type": "text",
            "analyzer": "kw_lowercase_analyzer"
          }
        }
      }
    }
  }
}
The field is available in the nested data structure.
Here's part of an example doc (trimmed due to character limits on this forum):
...
"_source": {
  ...
  "data": [
    {
      "key": "84f2c",
      "name": "description",
      "value_string": "Foo bar",
      "type": "string"
    },
    ...
    {
      "key": "672c5",
      "name": "views",
      "value_double": 18,
      "type": "double"
    }
  ],
  "filtered": [
    {
      "key": "84f2c",
      "name": "description",
      "value": "foo bar",
      "raw": "foo bar"
    },
    ...
  ],
  "indexed": "foo bar",
  ...
}
...
The goal is to factor in views (=18 in this example) as a boosting value, for example through log1p to smooth out large values.
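To make the intent concrete, here's a rough Python sketch of the arithmetic I'm expecting. It assumes, as I understand the docs, that field_value_factor first multiplies the field value by factor and then applies the modifier, and that log1p means the base-10 logarithm of 1 plus that value. The helper name expected_factor is made up for illustration and is not part of Elasticsearch.

```python
import math


def expected_factor(value, factor=1.0, modifier="none"):
    """Approximate the number field_value_factor should yield for one value.

    Assumption: the factor is applied before the modifier, and log1p is
    log10(1 + x). Only the two modifiers discussed here are modelled.
    """
    scaled = factor * value
    if modifier == "none":
        return scaled
    if modifier == "log1p":
        return math.log10(1 + scaled)
    raise ValueError(f"modifier not modelled in this sketch: {modifier}")


views = 18
# With boost_mode=replace and no modifier I'd expect the score to be 2 * 18.
print(expected_factor(views, factor=2))
# With log1p the same document should land at about 1.57.
print(expected_factor(views, factor=2, modifier="log1p"))
```

So with my debug settings (factor 2, no modifier) I'd expect a score of 36 for this document, and with log1p a much flatter curve where even very large view counts only add a few points.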
Here's my (simplified) query:
{
  "query": {
    "function_score": {
      "query": {
        "bool": {
          "filter": [
            {
              "term": {
                "collection_id": "5bf6f8c51cd759010c0e70d4"
              }
            }
          ],
          "must": [
            {
              "dis_max": {
                "queries": [
                  {
                    "match": {
                      "indexed": {
                        "boost": 1,
                        "zero_terms_query": "all",
                        "query": "foo",
                        "minimum_should_match": "75%"
                      }
                    }
                  },
                  {
                    "nested": {
                      "path": "filtered",
                      "query": {
                        "bool": {
                          "must": [
                            {
                              "term": {
                                "filtered.name": {
                                  "value": "description"
                                }
                              }
                            },
                            {
                              "match": {
                                "filtered.value": {
                                  "query": "foo",
                                  "boost": 6,
                                  "fuzziness": 0
                                }
                              }
                            }
                          ]
                        }
                      }
                    }
                  },
                  {
                    "nested": {
                      "path": "filtered",
                      "query": {
                        "term": {
                          "filtered.raw": {
                            "value": "foo",
                            "boost": 10
                          }
                        }
                      }
                    }
                  }
                ],
                "tie_breaker": 0.3
              }
            }
          ]
        }
      },
      "functions": [
        {
          "field_value_factor": {
            "field": "data.value_double",
            "factor": 2,
            "modifier": "None",
            "missing": 1
          },
          "filter": {
            "nested": {
              "path": "data",
              "query": {
                "term": {
                  "data.key": "672c5"
                }
              }
            }
          }
        }
      ],
      "boost_mode": "replace"
    }
  },
  "_source": {
    "includes": [
      "data"
    ]
  },
  "track_scores": true
}
The query does a dis_max on three (in this example) queries:
- A query on the indexed field (a combined field that is the result of concatenation of multiple fields and has the most analyzers on it)
- A bool query on a specific field, with a match on the value of that nested document using some relatively simple analyzers (filtered)
- A big boost if we find an exact match in the raw part of a field
The gist is in the functions part of the query.
(I've used boost_mode=replace as a debugging aid, to quickly see whether the document gets a score equal to the field value.)
The resulting docs aren't getting the score of the field; they use the missing/fallback value (=1) instead. I suspect that either the filter isn't doing its job or the reference to that specific field via data.value_double is failing.
How can I construct the function so that it captures the value of data.value_double and uses it as a scoring factor?