Hi,
At the last Elastic On in Paris, I met Adrien & Jim to talk about our use case.
On our platform we use ES to store hierarchical data, and neither ES nor Lucene is very good with this kind of document.
So I wanted their advice about:
- creating a "reverse-children" aggregation, which should help us
- creating an expression script language that can handle a document's "children" with good performance (relying on doc values instead of _source)
Since Elastic On, I have had some success with these tasks, but I had to dig deep into the Lucene level to get good performance.
In particular, I had to rely on global ordinal maps and/or BitSets, which are supposed to be cached.
I didn't find any clean way to use the ES infrastructure that is in charge of these global ordinal & BitSetFilter caches.
So I had to patch Elasticsearch (v6.2.3) to benefit from these caches.
My patch consists of the following changes.
A modification of the FilterScript & SearchScript Factories to access QueryShardContext:
/** A factory to construct stateful {@link SearchScript} factories for a specific index. */
public interface Factory {
    LeafFactory newFactory(Map<String, Object> params, SearchLookup lookup, QueryShardContext queryShardContext);
}
This makes the QueryShardContext (and therefore the global ordinal and BitSetFilter caches) accessible from scripts. It touches about 15 places in the server & modules code.
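To make the idea concrete, here is a minimal sketch of how the extra factory argument lets a script reach a shard-level cache. It does not use any Elasticsearch class: StubShardContext, StubLeafScript, StubScriptFactory and ordinalFor are hypothetical stand-ins I made up for illustration, and SearchLookup is left out to keep it short.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for QueryShardContext: it only exposes one shard-level cache.
final class StubShardContext {
    private final Map<String, Integer> ordinalCache = new HashMap<>();
    int loads = 0; // how many ordinals had to be computed rather than read from the cache

    int ordinalFor(String term) {
        Integer ord = ordinalCache.get(term);
        if (ord == null) {
            loads++;
            ord = ordinalCache.size(); // next free ordinal
            ordinalCache.put(term, ord);
        }
        return ord;
    }
}

// Hypothetical stand-in for the per-segment script.
interface StubLeafScript {
    int run(String term);
}

// Same shape as the patched Factory, minus SearchLookup.
interface StubScriptFactory {
    StubLeafScript newFactory(Map<String, Object> params, StubShardContext context);
}

final class ContextAwareScriptDemo {
    // A script implementation that resolves terms through the context's cache.
    static final StubScriptFactory FACTORY =
        (params, context) -> term -> context.ordinalFor(term);

    public static void main(String[] args) {
        StubShardContext context = new StubShardContext();
        // Two scripts built for the same request (say, one for sorting, one for filtering).
        StubLeafScript sortScript = FACTORY.newFactory(new HashMap<>(), context);
        StubLeafScript filterScript = FACTORY.newFactory(new HashMap<>(), context);
        System.out.println(sortScript.run("parent-a"));
        System.out.println(filterScript.run("parent-a"));
        System.out.println("cache loads: " + context.loads);
    }
}
```

Both scripts are built from the same context, so the second lookup of the same term is served from the cache instead of being recomputed.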
A map, and its accessor, on QueryShardContext:
public class QueryShardContext extends QueryRewriteContext {
    private final Map<String, Object> freeMap;
I use this map as a cache for scripts that run in different contexts within the same query (e.g. sorting, filtering & aggregation).
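A rough sketch of how such a map can serve as a per-query cache shared by several scripts is shown here. Again, nothing in it is real Elasticsearch code: StubQueryContext, getFreeMap and PerQueryCache are hypothetical names for illustration, and the sketch assumes the context is only touched by one thread at a time (hence the plain HashMap).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for the patched QueryShardContext: one map per query, plus its accessor.
final class StubQueryContext {
    private final Map<String, Object> freeMap = new HashMap<>();

    Map<String, Object> getFreeMap() {
        return freeMap;
    }
}

// Hypothetical helper: compute a value once per query and reuse it afterwards.
final class PerQueryCache {
    @SuppressWarnings("unchecked")
    static <T> T getOrCompute(StubQueryContext context, String key, Supplier<T> loader) {
        // The loader only runs when the key is not in the map yet.
        return (T) context.getFreeMap().computeIfAbsent(key, k -> loader.get());
    }
}

final class PerQueryCacheDemo {
    public static void main(String[] args) {
        StubQueryContext context = new StubQueryContext();
        int[] loaderCalls = {0};
        Supplier<String> expensiveLoader = () -> {
            loaderCalls[0]++;
            return "children-structure"; // placeholder for something costly to build
        };
        // The same key is requested from a "sort" script and then from a "filter" script.
        String forSort = PerQueryCache.getOrCompute(context, "children", expensiveLoader);
        String forFilter = PerQueryCache.getOrCompute(context, "children", expensiveLoader);
        System.out.println(forSort == forFilter);
        System.out.println("loader calls: " + loaderCalls[0]);
    }
}
```

The map is keyed by plain strings here, so scripts that want to share a value have to agree on the key.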
Patching the ES server will be a pain for us (in terms of maintenance, upgrades, etc.).
Do you ES folks think that these modifications should be integrated into ES?
Or maybe it is a deliberate design choice that scripts can't access QueryShardContext (perhaps for security reasons)?
Or maybe there is a better way to use the global ordinal caches in scripts?
Thanks
Franck