How to search aggregations in ES?

(Redkin) #1

I have a books index in which each document contains an array of tags (mapped as both text and keyword). I'd like to offer autocomplete for tags, so that when a user types "ro" it returns "romance" or "rock and roll".

Here's my mapping:

/books {
  tags: {
    type: 'text',
    fields: {
      keyword: {type: 'keyword'}
    }
  }
}

Example book:

{ name: "foo", tags: ['romance', 'story', 'fiction'] }

My aggregation for tags:

      size: 0,
      aggregations: {
        options: {
          terms: {
            field: `tags.keyword`,
            size: 20
          }
        }
      }

The problem is that I also get tags that do not contain "ro", simply because they appear in the same documents as tags that do.

How can I only get all distinct tags that match "ro"?

(Abdon Pijpelink) #2

This is not entirely trivial to do with the aggregations API. Have you considered using the completion suggester instead? It was designed for this exact use case.
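For anyone who wants to stay with aggregations: the terms aggregation accepts an `include` parameter that takes a regular expression and keeps only the buckets whose term matches it. The sketch below builds such a request body as a Python dict. It is an illustration under assumptions, not a tested recipe for this index: the function names are hypothetical, the escaping helper is deliberately conservative, the optional `prefix` query is only there to narrow the documents scanned, and matching on a keyword field is case-sensitive.

```python
import json


def escape_lucene_regex(text: str) -> str:
    """Hypothetical helper: backslash-escape every non-alphanumeric character
    so that user input is matched literally by the regex engine."""
    return "".join(ch if ch.isalnum() else "\\" + ch for ch in text)


def tag_prefix_aggregation(prefix: str, size: int = 20) -> dict:
    """Build a search body whose terms buckets are limited to tags that
    start with `prefix` (hypothetical function name)."""
    return {
        "size": 0,
        # Only look at documents that have at least one tag with this prefix.
        "query": {"prefix": {"tags.keyword": {"value": prefix}}},
        "aggs": {
            "options": {
                "terms": {
                    "field": "tags.keyword",
                    # Drop buckets for other tags that merely co-occur
                    # in the matching documents.
                    "include": escape_lucene_regex(prefix) + ".*",
                    "size": size,
                }
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(tag_prefix_aggregation("ro"), indent=2))
```

The resulting body can be sent to the `_search` endpoint of the books index. The `include` filter is what removes unrelated tags such as "story" or "fiction" from the buckets.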

(Pranav) #3

Is it better to use the completion suggester or the edge ngram tokenizer for autocomplete?

(Abdon Pijpelink) #4

The advantage of the completion suggester is that it keeps its suggestion data in memory. As a result, suggestions are returned very quickly, typically faster than a query using edge ngrams.
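As a rough sketch of what the completion suggester setup could look like for this thread's books index, the snippet below builds a mapping with an extra `completion` field and a suggest request for a typed prefix. The field name `tags_suggest`, the suggestion name `tag_suggest`, and the function names are assumptions made for illustration, and the typeless mapping layout assumes a 7.x or later cluster. Completion suggestions are returned per document, so it is worth checking how books that carry several matching tags behave before relying on this.

```python
def books_mapping_with_suggest() -> dict:
    """Mapping sketch: the original tags field plus a completion field
    (the name `tags_suggest` is hypothetical)."""
    return {
        "mappings": {
            "properties": {
                "tags": {
                    "type": "text",
                    "fields": {"keyword": {"type": "keyword"}},
                },
                "tags_suggest": {"type": "completion"},
            }
        }
    }


def example_book() -> dict:
    """A book whose tags are indexed into both fields."""
    tags = ["romance", "story", "fiction"]
    return {"name": "foo", "tags": tags, "tags_suggest": tags}


def tag_suggest_request(prefix: str, size: int = 20) -> dict:
    """Suggest request body for the text typed so far."""
    return {
        "_source": False,  # only the suggestion text is needed
        "suggest": {
            "tag_suggest": {
                "prefix": prefix,
                "completion": {
                    "field": "tags_suggest",
                    "size": size,
                    # Collapse identical tags coming from different books.
                    "skip_duplicates": True,
                },
            }
        },
    }
```

The request body goes to the `_search` endpoint; the matched tag for each option is returned as the option's `text`.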

(Pranav) #5

I could not get highlighting to work with the completion suggester. A comparison of different autocompletion methods also mentions that the completion suggester does not support highlighting. Is there a way to overcome this?

(Abdon Pijpelink) #6

I don't think the completion suggester supports highlighting. If you need that, you may be better off with a search-based approach.
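One possible shape of such a search-based approach is a `match_phrase_prefix` query on the analyzed `tags` field combined with a `highlight` section. This is a minimal sketch under assumptions: the function name is hypothetical, and whether this query type is fast enough for a given index is not something the thread establishes.

```python
def tag_search_with_highlight(text: str, size: int = 10) -> dict:
    """Search body that prefix-matches the last typed word against the
    `tags` text field and asks for highlighted fragments
    (hypothetical function name)."""
    return {
        "size": size,
        "query": {
            "match_phrase_prefix": {
                "tags": {"query": text}
            }
        },
        # Default highlighter settings; matches are wrapped in <em> tags.
        "highlight": {
            "fields": {"tags": {}}
        },
    }
```

Each hit then carries a `highlight` section for `tags`, which the completion suggester does not provide.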

(Pranav) #7

Edge ngram fails to suggest when searching with terms that contain symbols. I know the whitespace analyzer solves this issue when I use it in a different query. How do I specify the "whitespace" tokenizer within a custom edge ngram analyzer?

(Abdon Pijpelink) #8

The documentation has an example of how to configure a custom analyzer:

Just replace "tokenizer": "standard" in that example with "tokenizer": "whitespace" (and of course apply the filters that you want to use instead of those in the example).
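A sketch of what the resulting index settings might look like, written as a Python dict. The analyzer and filter names (`autocomplete`, `autocomplete_search`, `autocomplete_filter`) and the gram sizes are assumptions for illustration, not values from the documentation example. The small `simulate_edge_ngrams` helper is a local approximation of the token stream, included only to show why symbols survive with the whitespace tokenizer; it does not call Elasticsearch.

```python
def autocomplete_index_settings(min_gram: int = 2, max_gram: int = 10) -> dict:
    """Index settings with a custom analyzer that splits on whitespace only,
    lowercases, and then emits edge ngrams (all names are hypothetical)."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    "autocomplete_filter": {
                        "type": "edge_ngram",
                        "min_gram": min_gram,
                        "max_gram": max_gram,
                    }
                },
                "analyzer": {
                    # Index-time analyzer: whitespace tokens -> edge ngrams.
                    "autocomplete": {
                        "type": "custom",
                        "tokenizer": "whitespace",
                        "filter": ["lowercase", "autocomplete_filter"],
                    },
                    # Search-time analyzer: same tokenizer, no ngrams.
                    "autocomplete_search": {
                        "type": "custom",
                        "tokenizer": "whitespace",
                        "filter": ["lowercase"],
                    },
                },
            }
        }
    }


def simulate_edge_ngrams(text: str, min_gram: int = 2, max_gram: int = 10) -> list:
    """Local approximation of the index-time analyzer's output."""
    grams = []
    for token in text.lower().split():  # whitespace split keeps symbols
        for n in range(min_gram, min(max_gram, len(token)) + 1):
            grams.append(token[:n])
    return grams
```

With these defaults, `simulate_edge_ngrams("C++ rock")` gives `['c+', 'c++', 'ro', 'roc', 'rock']`: the "+" characters stay part of the token because only whitespace separates tokens.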

