First post here, so hello, nice to meet you all.
I am playing with aggregations and getting some great results, but I have one particular use case where I don't have an optimal result yet, and I was hoping for a steer.
Each document has a locality which is an array structured like this to support hierarchical searching:
The numeric prefix is a depth indicator that is used when filtering.
I am creating an auto-suggest of available localities. When someone types in "rea", I only want to return this result in the aggregation bucket:
However, in order to make the autocomplete suggestions work, I am using a pattern tokenizer so I can search only the lowest fragment in the hierarchy:
'tokenizer' => array(
    'descendent' => array(
        'type' => 'pattern',
        'pattern' => '([^/]+$)', // Get everything after the last slash
        'group' => 1
    ),
),
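To make the intent of that tokenizer concrete, here is a minimal PHP sketch that applies the same regular expression outside Elasticsearch. The sample locality value is made up for illustration (the real values may be shaped differently), and `preg_match` only approximates what the pattern tokenizer does with capture group 1.

```php
<?php
// Hypothetical locality value: depth prefix followed by the hierarchy path.
$locality = '3/England/Berkshire/Reading';

// Same pattern as the 'descendent' tokenizer. '#' is used as the PHP regex
// delimiter so the '/' inside the character class needs no escaping.
preg_match('#([^/]+$)#', $locality, $matches);

// Capture group 1 holds everything after the last slash.
$token = $matches[1];

echo $token, PHP_EOL; // Reading
```

With this value the only token emitted is the lowest fragment, which is what the suggestions are matched against.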
I am using an edge n-gram filter so I can do a "starts with"-style search:
'edge_ngram_filter' => array(
    'type' => 'edge_ngram',
    'min_gram' => 2,
    'max_gram' => 20,
),
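As a rough illustration of what this filter produces, the sketch below generates the prefixes for a single token using the same `min_gram` and `max_gram` values. The helper name `edgeNgrams` and the sample token are hypothetical, and the token is assumed to be lowercased already; this mimics the filter in plain PHP rather than calling Elasticsearch.

```php
<?php
// Build the prefixes an edge n-gram filter would emit for one token.
// Byte-based string functions are used, which is fine for this ASCII sample.
function edgeNgrams(string $token, int $minGram, int $maxGram): array
{
    $grams = [];
    // Never ask for a prefix longer than the token itself.
    $limit = min($maxGram, strlen($token));
    for ($length = $minGram; $length <= $limit; $length++) {
        $grams[] = substr($token, 0, $length);
    }
    return $grams;
}

$grams = edgeNgrams('reading', 2, 20);
echo implode(', ', $grams), PHP_EOL; // re, rea, read, readi, readin, reading
```

Because "rea" is one of the emitted prefixes, a search for "rea" matches the document, while a single character such as "r" does not, since `min_gram` is 2.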
And then I put that together in my analyzer:
'edge_ngram_analyzer_descendent' => array(
'type' => 'custom',
'tokenizer' => 'descendent',
'filter' => array(
But the difficulty is that I need to return a value from a different field in my aggregation: the untouched version. If I set up my aggregation like this:
Then I will receive all items from the locality array in the document, not just the one that matches.
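The effect can be reproduced without Elasticsearch. The sketch below is only a simulation in plain PHP, with a made-up document and locality values: a terms aggregation builds a bucket for every value of the field on each matching document, so a document that matched because of one entry still contributes all of its entries.

```php
<?php
// Hypothetical document that matched the "rea" query via its last entry.
$matchedDocs = [
    ['locality' => [
        '1/England',
        '2/England/Berkshire',
        '3/England/Berkshire/Reading',
    ]],
];

// Simulate a terms aggregation: every value on every matching document
// gets counted, regardless of which value caused the match.
$buckets = [];
foreach ($matchedDocs as $doc) {
    foreach ($doc['locality'] as $value) {
        $buckets[$value] = ($buckets[$value] ?? 0) + 1;
    }
}

// Prints three bucket keys, although only the third one matches "rea".
print_r(array_keys($buckets));
```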
And I can't use an include on the aggregation like:
because if I do, then I have to apply it to one of my analyzed fields to account for the lowercasing and the extraction of the lowest descendent.
What direction should I be looking in to solve this?
Many thanks in advance