How to search aggregations in ES?

I have a books index that contains an array of tags (mapped as both text and keyword). I'd like to offer autocomplete for tags, so that when a user types "ro" it returns "romance" or "rock and roll".

Here's my mapping:

/books {
 ...
  tags: {
    type: 'text',
    fields: {
      keyword: {type: 'keyword'}
    }
  }
}

Example book

{ name: "foo", tags: ['romance', 'story', 'fiction'] }

My aggregation for tags:

{
  size: 0,
  aggregations: {
    options: {
      terms: {
        field: 'tags.keyword',
        size: 20
      }
    }
  }
}

The problem is that I also get tags that do not contain "ro". They simply happen to be in the same documents as tags that do match "ro".

How can I only get all distinct tags that match "ro"?

This is not entirely trivial to do with the aggregations API. Have you considered using the completion suggester instead? It was designed for this exact use case.
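To make both options concrete, here is a rough sketch of the two request bodies as plain JavaScript objects. The first keeps the terms aggregation from the question and adds the `include` parameter, which restricts buckets to terms matching a regular expression (matching on a keyword field is case-sensitive, so this may only be good enough for simple prefixes). The second uses a completion field. The field name `tags_suggest`, the suggestion name `tag_options`, and the helper function names are assumptions for illustration, not part of the original mapping.

```javascript
// Sketch only: these helpers just build request bodies; sending them is left
// to whatever client you use. All names below are hypothetical.

// Option A: terms aggregation limited to terms that start with the prefix.
function tagsAggregation(prefix) {
  // Escape regex-reserved characters so the typed text is matched literally.
  const escaped = prefix.replace(/[.?+*|{}[\]()"\\#@&<>~]/g, '\\$&');
  return {
    size: 0,
    aggregations: {
      options: {
        terms: {
          field: 'tags.keyword',
          include: `${escaped}.*`, // only buckets whose term matches this regex
          size: 20
        }
      }
    }
  };
}

// Option B: a dedicated completion field (assumed name: tags_suggest),
// populated with the same values as tags at index time.
const completionMapping = {
  properties: {
    tags_suggest: { type: 'completion' }
  }
};

function tagSuggestRequest(prefix) {
  return {
    suggest: {
      tag_options: {
        prefix,
        completion: {
          field: 'tags_suggest',
          size: 20,
          skip_duplicates: true // collapse identical suggestion texts
        }
      }
    }
  };
}

console.log(JSON.stringify(tagsAggregation('ro'), null, 2));
console.log(JSON.stringify(tagSuggestRequest('ro'), null, 2));
```

Option A needs no mapping change, while Option B requires reindexing with the extra field.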

Is it better to use the completion suggester or the edge ngram tokenizer for autocomplete?

The advantage of the completion suggester is that it keeps the suggestions in memory. As a result, suggestions are returned very quickly, typically faster than a query using edge ngrams.
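For comparison, the edge ngram approach works by indexing every prefix of each token as its own term and then running an ordinary query against those terms. The function below is a simplified illustration of that idea, not Elasticsearch's actual implementation; the function name and the default lengths are assumptions.

```javascript
// Simplified illustration: list the prefixes of a token between minGram and
// maxGram characters long, which is roughly what an edge n-gram filter emits.
function edgeNgrams(token, minGram = 1, maxGram = 20) {
  const grams = [];
  const limit = Math.min(maxGram, token.length);
  for (let len = minGram; len <= limit; len++) {
    grams.push(token.slice(0, len));
  }
  return grams;
}

console.log(edgeNgrams('romance', 1, 4)); // [ 'r', 'ro', 'rom', 'roma' ]
```

Typing "ro" then matches the indexed term "ro" directly, at the cost of a larger index.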

I could not get highlighting to work with the completion suggester. "Comparison of different methods for autocompletion" also mentions that the completion suggester does not support highlighting. Is there a way to overcome this?

I don't think the completion suggester supports highlighting. If you need that, you may be better off with a search-based approach.
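A search-based request with highlighting could look roughly like this. It assumes a sub-field named `tags.autocomplete` that is analyzed with edge ngrams at index time; that field name and the helper function are hypothetical. Exactly which part of the text gets highlighted depends on how the field is analyzed.

```javascript
// Sketch only: builds a search body with a match query plus a highlight
// section on the same (assumed) field.
function highlightedTagSearch(text) {
  return {
    query: {
      match: {
        'tags.autocomplete': { query: text }
      }
    },
    highlight: {
      fields: {
        'tags.autocomplete': {} // default highlighter settings
      }
    }
  };
}

console.log(JSON.stringify(highlightedTagSearch('ro'), null, 2));
```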

Edge ngram fails to suggest when searching with terms that contain symbols. I know the whitespace analyzer solves this issue when I use it in a different query. How do I specify the "whitespace" tokenizer within an edge ngram custom analyzer?

The documentation has an example of how to configure a custom analyzer: https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-custom-analyzer.html#_example_configuration_5

Just replace "tokenizer": "standard" in that example with "tokenizer": "whitespace" (and of course apply the filters that you want to use instead of those in the example).
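Put together, an index body along these lines might work. This is a sketch assuming a recent Elasticsearch version with typeless mappings. The analyzer, filter, and sub-field names (`autocomplete_index`, `autocomplete_search`, `autocomplete_filter`, `tags.autocomplete`) and the gram lengths are assumptions chosen for illustration.

```javascript
// Sketch only: index settings and mappings as a plain object.
const indexBody = {
  settings: {
    analysis: {
      filter: {
        // Emits prefixes of each token, from 1 to 20 characters.
        autocomplete_filter: { type: 'edge_ngram', min_gram: 1, max_gram: 20 }
      },
      analyzer: {
        // Index time: split on whitespace only, so symbols stay in the token.
        autocomplete_index: {
          type: 'custom',
          tokenizer: 'whitespace',
          filter: ['lowercase', 'autocomplete_filter']
        },
        // Search time: same tokenizer, but no n-grams on the typed text.
        autocomplete_search: {
          type: 'custom',
          tokenizer: 'whitespace',
          filter: ['lowercase']
        }
      }
    }
  },
  mappings: {
    properties: {
      tags: {
        type: 'text',
        fields: {
          keyword: { type: 'keyword' },
          autocomplete: {
            type: 'text',
            analyzer: 'autocomplete_index',
            search_analyzer: 'autocomplete_search'
          }
        }
      }
    }
  }
};

console.log(JSON.stringify(indexBody, null, 2));
```

Using a separate search analyzer keeps the user's input from being split into n-grams as well, which would otherwise match far more terms than intended.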