I send the following query to Elasticsearch:
GET my-index/_search
{
  "query": {
    "nested": {
      "inner_hits": {},
      "score_mode": "max",
      "path": "my_nested_field",
      "query": {
        "bool": {
          "should": [
            {
              "bool": {
                "must": [
                  {
                    "match": {
                      "my_nested_field.value.token_analyzed": {
                        "query": "Looking for something like this"
                      }
                    }
                  }
                ]
              }
            }
          ]
        }
      }
    }
  },
  "rescore": {
    "my_plugin_name": {}
  }
}
Documents in the index look something like this:
{
  "some_field": "some_value",
  "some_other_field": "some_other_value",
  "my_nested_field": [
    {
      "value": "some nested value",
      "something_else": "something else"
    },
    {
      "value": "some nested value 2",
      "something_else": "something else 2"
    }
  ]
}
My custom rescorer plugin is executed and everything works. I would like to optimize it, though. Currently, when a document is hit, I use every element in my_nested_field to rescore the top-level document. I would like to use only the elements that actually caused the hit, but I don't know how to filter out the non-matching ones inside the plugin.
My current code:
public TopDocs rescore(TopDocs topDocs, IndexSearcher searcher, RescoreContext rescoreContext) throws IOException {
    for (int i = 0; i < topDocs.scoreDocs.length; i++) {
        // Load the stored fields of the top-level hit
        Document document = searcher.doc(topDocs.scoreDocs[i].doc);
        String json = parseSource(document);
    }
    ...
}

private String parseSource(Document document) {
    // utf8ToString() honors the BytesRef offset and length instead of decoding the whole backing array
    return document.getField("_source").binaryValue().utf8ToString();
}
What I'm looking for is not in _source, but _source and _id are the only things I can read this way. I expect that's because only stored fields can be read like this. But surely there must be some way to access the inner hits scoring results from the plugin?
In the actual ES response, right next to each document's _source, there is this (but I don't know how to access it in the plugin):
"inner_hits": {
  "my_nested_field": {
    "hits": {
      "total": {
        "value": 1,
        "relation": "eq"
      },
      "max_score": 4.2184687,
      "hits": [ // I NEED THIS STUFF NOT THE _source
        {
          "_index": "my-index",
          "_type": "_doc",
          "_id": "8b3d929a-8e90-4ce7-aa1e-7f11ec16de1e",
          "_nested": {
            "field": "my_nested_field",
            "offset": 2
          },
          "_score": 4.2184687,
          "_source": {
            "value": "Some value which was actually hit"
          }
        }
      ]
    }
  }
}
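To make the goal concrete, here is a minimal sketch of the filtering step I have in mind. It assumes the matching offsets (the "_nested.offset" values from inner_hits, which I take to be 0-based positions in my_nested_field) were somehow available inside rescore(); getting hold of them there is exactly the part I'm missing. The class NestedHitFilter and the method keepMatched are hypothetical names used only for illustration, not part of Elasticsearch or Lucene.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

class NestedHitFilter {

    // Hypothetical helper: keeps only the nested elements whose position in
    // my_nested_field is contained in matchedOffsets. The offsets are ASSUMED
    // to be the 0-based "_nested.offset" values reported by inner_hits.
    static List<Map<String, Object>> keepMatched(List<Map<String, Object>> nestedElements,
                                                 Set<Integer> matchedOffsets) {
        List<Map<String, Object>> matched = new ArrayList<>();
        for (int offset = 0; offset < nestedElements.size(); offset++) {
            if (matchedOffsets.contains(offset)) {
                matched.add(nestedElements.get(offset));
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        // Stand-in for my_nested_field as parsed from _source
        List<Map<String, Object>> nested = List.of(
                Map.<String, Object>of("value", "some nested value"),
                Map.<String, Object>of("value", "some nested value 2"),
                Map.<String, Object>of("value", "Some value which was actually hit"));

        // Offset 2 mirrors the "_nested.offset" shown in the response above
        System.out.println(keepMatched(nested, Set.of(2)));
    }
}
```

With something like this in place, the rescorer would only look at the elements returned by keepMatched instead of the whole my_nested_field array.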
Side note: I need the full document after I make the query, not just the nested fields.