Terms aggregation on named queries

jonsgreen · October 10, 2016, 5:01am

I was hoping to do a terms aggregation on the 'matched_queries' field that is generated when running a named query. Since matched_queries is added outside the document, it is not currently accessible as a field for a terms aggregation. Could this be a feature request, or is it simply not possible because of how aggregations are built?

Mark_Harwood · October 10, 2016, 10:56am

I had a similar disappointment when I discovered I couldn't use these names in terms aggs.

Unfortunately, this information is only derived in the fetch phase, for the individual docs being returned, not inline in the collect phase when aggs run.
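
To make the distinction concrete, here is a minimal Python sketch, not anything taken from Elasticsearch itself. It builds a bool query whose clauses carry _name tags and then tallies matched_queries over the hits of a response. The field name "body", the tag names and the sample response are made up for illustration. Note that the tally only covers the hits that were actually fetched, which is why it is no substitute for an aggregation over all matching docs.

```python
from collections import Counter


def named_match_query(field, text, name):
    """Build a match clause tagged with a user-supplied _name."""
    return {"match": {field: {"query": text, "_name": name}}}


# Request body with three named clauses (field and tag names are hypothetical).
search_body = {
    "query": {
        "bool": {
            "should": [
                named_match_query("body", "elasticsearch", "es"),
                named_match_query("body", "logstash", "ls"),
                named_match_query("body", "kibana", "kb"),
            ]
        }
    }
}


def tally_matched_queries(response):
    """Count how often each query name appears across the returned hits only."""
    counts = Counter()
    for hit in response["hits"]["hits"]:
        counts.update(hit.get("matched_queries", []))
    return counts


# Hand-written stand-in for a search response; real responses carry more fields.
sample_response = {
    "hits": {
        "hits": [
            {"_id": "1", "matched_queries": ["es", "kb"]},
            {"_id": "2", "matched_queries": ["es"]},
            {"_id": "3", "matched_queries": ["ls"]},
        ]
    }
}

counts = tally_matched_queries(sample_response)
print(dict(counts))
```

Raising the request's size makes the tally cover more docs, but it still only sees fetched hits, whereas an agg sees every doc that reaches the collect phase.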

I am curious if this could be a feature request

It is likely to require changes to core Lucene. My view was that Lucene Query clauses currently gather only a score for each stream of matching docs. Maybe, like the Lucene tokenization API [1], additional metadata could optionally be emitted via a search equivalent of TokenStream Attribute objects.

This would allow each doc to have arbitrary "match metadata" attached as if it were a property of the document. A name/tag is one example of such metadata, e.g. your Boolean OR query that looks for the terms elasticsearch, logstash or kibana could associate the user-supplied tag elasticstack for use in aggs. As well as tagging a user-defined category like this, you could attach a numeric measure of belonging to a category, e.g. music-listener profiles could be ranked on their "death-metal-ness" or "jazz-ness" by supplying lists of bands in these genres and returning the number of band names a user matched in each query clause. These numbers provide a level of "about-ness" which could be plotted in a histogram agg etc.

Some of this is achievable today if you mess around with boosts, constant_score and scripted aggs to smuggle metadata out in Lucene's single float score, but it is a less than ideal way of getting at the details behind the Lucene matching logic.
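
A rough Python sketch of that kind of smuggling, under several assumptions: each clause is wrapped in constant_score with a power-of-two boost, the bool query simply sums the scores of its matching should clauses (no coordination factor or other normalisation), and a scripted terms agg keyed on _score is available in your version. The field name "body" and the agg name are hypothetical. Because the score is a single-precision float, the sum stays exact only for roughly 24 clauses.

```python
TAGS = ["elasticsearch", "logstash", "kibana"]


def tagged_clause(field, term, bit):
    """constant_score clause whose boost is 2**bit, so it owns one bit of the score."""
    return {
        "constant_score": {
            "filter": {"term": {field: term}},
            "boost": float(2 ** bit),
        }
    }


def build_body(field, tags):
    """Request body: summed clause scores act as a bitmask of the clauses that matched."""
    return {
        "size": 0,
        "query": {
            "bool": {
                "should": [tagged_clause(field, t, i) for i, t in enumerate(tags)]
            }
        },
        # Assumption: the agg script can read _score; the script syntax varies by version.
        "aggs": {"by_match_mask": {"terms": {"script": "_score"}}},
    }


def decode_mask(score, tags):
    """Turn a bucket key (the summed score) back into the list of tags that matched."""
    mask = int(round(score))
    return [t for i, t in enumerate(tags) if mask & (1 << i)]


body = build_body("body", TAGS)
print(decode_mask(5.0, TAGS))
```

Each bucket key of the agg would then be decoded with decode_mask to recover which combination of clauses the docs in that bucket matched.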

[1] TokenStream (Lucene 5.3.1 API)
