I'm trying to get started using Groovy scripts, and I'm having some trouble understanding the scope and limitations of the Groovy "API." At a high level, I'm trying to build what could be called "multi-score" queries (e.g., "document 1 scored 80% for relevance to your query and 90% for popularity/quality, giving a combined score of 85%", where all three numbers appear in the search results). To do this, I've been experimenting with computing the component scores using `script_fields`, and an overall score using
Here are the main issues I've run into so far:
- Is there any way to access the `inner_hits` data from a `nested` query from within a Groovy script?
- Is there any way to access script fields within a
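For anyone following along, here is a rough sketch of the kind of request body described above, built as a Python dict. The field names `body` and `popularity`, the averaging formula, and the helper name `build_multi_score_query` are all made up for illustration, and the exact script syntax differs between Elasticsearch versions, so treat this as an assumption-laden outline rather than a working recipe.

```python
import json


def build_multi_score_query(text):
    """Build a search body with a combined score plus one component score.

    Hypothetical names: the `body` and `popularity` fields and the averaging
    formula are illustrative assumptions, not taken from a real mapping.
    """
    return {
        "query": {
            "function_score": {
                "query": {"match": {"body": text}},
                # Combined score: average of relevance (_score) and popularity.
                "script_score": {
                    "script": "(_score + doc['popularity'].value) / 2"
                },
                # Use the script result as the final score.
                "boost_mode": "replace",
            }
        },
        # Component score returned alongside each hit.
        "script_fields": {
            "popularity_score": {"script": "doc['popularity'].value"}
        },
    }


body = build_multi_score_query("groovy scripting")
print(json.dumps(body, indent=2))
```

Only the popularity component is exposed through `script_fields` here; the relevance component would have to be recovered some other way.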
I've been doing more experiments, and have found a few things out:
- I'd assumed that `script_fields` code was executed before scoring, since `_score` is not available as a variable inside the `script_fields` code. This turned out to be false: `script_fields` is executed only on the hits that are returned, and is executed serially, e.g., adding `script_fields` will result in 25000ms of additional delay when there are 50 hits. Naturally, this means that it is impossible to use script fields in score calculations.
- `script_score` is parallelized, e.g., on a 350k-document 5-shard index, adding `script_score` resulted in a roughly 87s delay, not a 350s delay. I'm not sure why I got a ~4x speedup rather than a 5x one.
- I also experimented with using Python instead of Groovy, which allowed me to poke at the available locals (in `script_fields` anyway). There don't seem to be any variables related to inner hits. The list of variables is `['_CACHE', '_FREQUENCIES', '_OFFSETS', '_PAYLOADS', '_POSITIONS', '_doc', '_fields', '_index', '_source', 'doc']`, and in `script_score` there's the added variable
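The timing numbers above can be checked with a bit of arithmetic. This is only a back-of-the-envelope sketch using the figures quoted in the list; the function names are my own.

```python
def per_item_ms(total_ms, items):
    """Average cost per item when work is done strictly in serial."""
    return total_ms / items


def observed_speedup(serial_estimate_s, observed_s):
    """Ratio between the estimated serial time and the measured time."""
    return serial_estimate_s / observed_s


# script_fields: 25000 ms of extra delay spread over 50 returned hits.
fields_cost = per_item_ms(25000, 50)

# script_score: 350 s serial estimate over 350k documents, 87 s observed.
score_cost = per_item_ms(350 * 1000, 350000)
speedup = observed_speedup(350, 87)

print(fields_cost)        # 500.0 ms per hit
print(score_cost)         # 1.0 ms per document
print(round(speedup, 2))  # 4.02, against an ideal of 5 for 5 shards
```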
Hmm, yes I'm also looking for a way to access the inner hits inside a groovy script.
Did you already find something?
I skimmed a bit of the python-script extension's source code, and couldn't see any sign that it was selectively including some variables, but not others. Thus, I'm inclined to believe that inner hits are not accessible from within a groovy script.
It might be possible to fork and modify ES to support this feature (I haven't yet thought of a reason why it can't be done, anyway), but that's too big a side project for my current team. In the end, it was easiest to just recompute the inner hits from `doc` inside of `script_fields`. Not sure whether that's much help for your project or not.
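To give a rough idea of what "recomputing the inner hits" can look like, here is a standalone Python sketch. It uses a plain dict as a stand-in for the document data a script would see, and the `comments` field, the `stars` threshold, and the function name `recompute_inner_hits` are hypothetical. It is not the actual script from my project, just the general shape: re-apply the nested query's condition to each nested object by hand.

```python
def recompute_inner_hits(source, path, predicate):
    """Return (offset, object) pairs for nested objects matching predicate.

    `source` is a plain dict standing in for the document data; `path` names
    the nested field. Both are assumptions made for this illustration.
    """
    nested = source.get(path, [])
    return [(i, obj) for i, obj in enumerate(nested) if predicate(obj)]


# Hypothetical document with a nested `comments` field.
source = {
    "title": "doc 1",
    "comments": [
        {"author": "ann", "stars": 5},
        {"author": "bob", "stars": 2},
        {"author": "cy", "stars": 4},
    ],
}

# Re-apply the same condition the nested query would have used.
hits = recompute_inner_hits(source, "comments", lambda c: c["stars"] >= 4)
print([offset for offset, _ in hits])  # [0, 2]
```

The obvious downside is that the matching condition is duplicated between the query and the script, so the two have to be kept in sync by hand.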