Query autocomplete options

Hi,

I am currently working on implementing autocomplete functionality. So far I have looked into the following options:

  • Completion suggester. We would like query-time ranking, e.g. based on term frequency, which does not seem to be possible with the completion suggester.

  • NGram Tokenizer. If I understand correctly, this would greatly increase the index size.

  • match_phrase_prefix. It is described as a poor man's autocomplete, but it returns a full hit list like a regular search request.

Are there any other options, or should I look into writing a plugin? Ideally, I would like a phrase suggester that not only gives spelling suggestions but also autocomplete suggestions.

Hey,

Last week a new search_as_you_type datatype got merged. However, this will only be available in Elasticsearch 7.1 and above, so it will take some time until you can use it in a released ES version. You can read more about it here: https://www.elastic.co/guide/en/elasticsearch/reference/7.x/search-as-you-type.html
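
To give an idea of what using it looks like, here is a minimal sketch in Python that only builds the request bodies (it does not talk to a cluster). The field name "title" and the helper function names are made up for illustration; the bool_prefix multi_match over the field and its ._2gram/._3gram subfields follows the pattern shown on the linked documentation page.

```python
import json


def search_as_you_type_mapping(field: str) -> dict:
    """Index-creation body that maps `field` as search_as_you_type."""
    return {"mappings": {"properties": {field: {"type": "search_as_you_type"}}}}


def autocomplete_query(field: str, text: str, size: int = 5) -> dict:
    """Search body matching the typed text against the field and its
    automatically created shingle subfields; the last term is treated
    as a prefix by the bool_prefix type."""
    return {
        "size": size,
        "query": {
            "multi_match": {
                "query": text,
                "type": "bool_prefix",
                "fields": [field, f"{field}._2gram", f"{field}._3gram"],
            }
        },
    }


if __name__ == "__main__":
    # "title" is a hypothetical field name.
    print(json.dumps(search_as_you_type_mapping("title"), indent=2))
    print(json.dumps(autocomplete_query("title", "quick br"), indent=2))
```

The first body would go into the index creation request and the second into a _search request, with whatever client you normally use.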

--Alex

Looks nice, is there any way to experiment with this datatype until it is released?

You could either build the 7.x branch locally by running ./gradlew assemble or download a snapshot (note: these links will not be updated with newer snapshots):

https://snapshots.elastic.co/7.1.0-6e08fd97/downloads/elasticsearch/elasticsearch-7.1.0-SNAPSHOT-linux-x86_64.tar.gz
https://snapshots.elastic.co/7.1.0-6e08fd97/downloads/elasticsearch/elasticsearch-oss-7.1.0-SNAPSHOT-windows-x86_64.zip
https://snapshots.elastic.co/7.1.0-6e08fd97/downloads/elasticsearch/elasticsearch-7.1.0-SNAPSHOT-darwin-x86_64.tar.gz

Thanks, I have experimented with it for a bit and I like the suggestions it gives. Is there a way to take multiple documents into account? Let's say "test" occurs many times in an index and "testing" occurs only once, could it rank "test" higher than "testing" in that case?

Not sure I understand the 'taking multiple documents into account' part. Scoring works like a regular search, so each document gets scored against the query.

However, you have the full power of a bool query on top of this, so you could experiment with some should clauses to influence the score, for example.
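
As a rough sketch of that idea (again only building the request body in Python): the prefix match goes into must, and an optional match_phrase in should raises the score of documents that contain the typed text as an exact phrase. The field name "title", the helper name and the boost value of 2.0 are arbitrary choices for illustration, not recommendations.

```python
def boosted_autocomplete_query(field: str, text: str, size: int = 5) -> dict:
    """Bool query: the must clause decides which documents match,
    the should clause only adds to the score of those that also
    contain the typed text as a phrase."""
    return {
        "size": size,
        "query": {
            "bool": {
                "must": [
                    {
                        "multi_match": {
                            "query": text,
                            "type": "bool_prefix",
                            "fields": [field, f"{field}._2gram", f"{field}._3gram"],
                        }
                    }
                ],
                "should": [
                    # Hypothetical boost value; tune it against your own data.
                    {"match_phrase": {field: {"query": text, "boost": 2.0}}}
                ],
            }
        },
    }
```

Any other query can be used in the should clause in the same way, depending on which signal you want to reward.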

--Alex

I was trying to say that I would like the scoring of a suggestion to take into account term frequencies over the entire index, so it would give terms with higher frequency a higher ranking.
But since this is a regular search that results in a hit list of documents (and not a list of ranked terms/suggestions), I guess this may only be possible using aggregations.
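
For what it's worth, this is roughly what I have in mind for the aggregation route: a terms aggregation restricted to values starting with the typed prefix, which by default returns buckets ordered by document count. It is only a sketch that builds the request body in Python. The field name "title.keyword" and the helper names are my own assumptions, it assumes the field is a keyword field (so whole field values are counted, not individual tokens), and the counts are numbers of documents rather than total term occurrences.

```python
# Characters that are reserved in Lucene regular expressions, including
# the optional operators; escaping all of them is a cautious choice.
LUCENE_REGEX_RESERVED = set('.?+*|{}[]()"\\#@&<>~')


def escape_lucene_regex(text: str) -> str:
    """Backslash-escape reserved characters so the prefix is matched literally."""
    return "".join("\\" + ch if ch in LUCENE_REGEX_RESERVED else ch for ch in text)


def frequent_terms_request(field: str, prefix: str, size: int = 5) -> dict:
    """Search body that returns no hits, only the most frequent values
    of `field` that start with `prefix`."""
    return {
        "size": 0,  # skip the hit list, only the aggregation is needed
        "aggs": {
            "suggestions": {
                "terms": {
                    "field": field,
                    "include": escape_lucene_regex(prefix) + ".*",
                    "size": size,
                }
            }
        },
    }
```

With something like frequent_terms_request("title.keyword", "test"), "test" should come out above "testing" if it occurs in more documents, which is the ranking I was after.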
