Setting up autocomplete to return tags, not documents

Graham_Ashton · August 27, 2014, 1:18pm

Hi. I could do with a bit of help getting a list of tags (as defined by my
app's users) that match the text they've already typed.

When a user starts typing the name of a tag I want to suggest the names of
all the tags that contain the string that they've typed. If some of my
documents are tagged with "foo", "bar" or "foo-bar" I'd like to suggest:

1. "foo" and "foo-bar" if they type "fo", or
2. "bar" and "foo-bar" if they type "ba"

I've put all the data, mappings, queries and index settings in this gist:

https://gist.github.com/gma/caef19e6271aec0f9e56

aggregated-query.js

// The query that (in the Sense console) produced the contents of results.js

// GET planner_development/_search?search_type=count
{
  "query": {
    "terms": {
      "tag_ngrams": [
        "fo"
      ]
    }

This file has been truncated.

data.js

// This file shows the output of a `"match_all": {}` query, so you can see the source data

{
   "took": 2,
   "timed_out": false,
   "_shards": {
      "total": 5,
      "successful": 5,
      "failed": 0
   },

This file has been truncated.

index-settings.js

// This file shows how I've set up indexing tags as ngrams
//
// curl http://localhost:9200/planner_development/_settings?pretty=true
{
  "planner_development" : {
    "settings" : {
      "index" : {
        "uuid" : "69yzbIUuSemtWCNrCzctKg",
        "analysis" : {
          "analyzer" : {

This file has been truncated.

The gist contains more than three files.

tl;dr - I'd like to remove the "key": "bar" bucket from results.js.

Here's the long version.

I've seen the Completion Suggester, but as I understand it, it won't cope
with (2), where "ba" has to match in the middle of "foo-bar".

So I've started down the route of indexing my tags as ngrams, as described
in the Definitive Guide. Searching on the ngrams works really well for
suggesting documents that have been tagged with a given tag (e.g. "fo"
finds "foo-bar", etc.), but...

...I want to return to the user a list of tags, rather than a list of
documents that are tagged with a tag that contains the ngram the user has
typed in.

I asked how to do this on IRC a while ago, and @dadoonet said I needed
"term aggregation" (thanks!). I've now got a query that scopes my
aggregation so that I can search for tags by ngrams.

I've actually indexed the tags on my documents twice; once with the
whitespace analyzer, and once with a custom analyzer that sets up ngram
indexing.
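
To make the two analyses concrete, here is a rough Python sketch of what each one produces for a tag string. It is only an illustration, not the analyzer configuration from the gist: the gram sizes (2 to 3) and the helper names `whitespace_tokens` and `ngrams` are assumptions, since the real settings are truncated above.

```python
def whitespace_tokens(text):
    """Mimic a whitespace analyzer: split on whitespace, keep each tag whole."""
    return text.split()


def ngrams(token, min_gram=2, max_gram=3):
    """Return every substring of `token` whose length is in [min_gram, max_gram].

    The gram sizes are assumed values, not the ones used in the gist.
    """
    grams = []
    for size in range(min_gram, max_gram + 1):
        for start in range(len(token) - size + 1):
            grams.append(token[start:start + size])
    return grams


tags = "foo bar"

# First copy of the field: whole tags, suitable for aggregating on.
whole = whitespace_tokens(tags)
print(whole)  # ['foo', 'bar']

# Second copy of the field: ngrams of each tag, suitable for partial matching.
grams = [gram for tag in whole for gram in ngrams(tag)]
print(grams)  # ['fo', 'oo', 'foo', 'ba', 'ar', 'bar']
```

The whole-tag copy is what a terms aggregation would bucket on, while the ngram copy is what lets a query for "fo" find the document in the first place.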

It works fine until a document is tagged with multiple tags. If I tag a
document with "foo" and "bar", both those tags are returned when I search
for tags that contain the ngram "fo".

Why? The aggregation finds all documents that have a tag containing the
ngram "fo", and then returns all the tags on that set of documents.
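
The behaviour can be reproduced without Elasticsearch. The sketch below is a plain-Python stand-in for the query and the aggregation, assuming three made-up documents that use only the tag values mentioned in this post; `matching_docs` and `terms_buckets` are hypothetical helpers, not Elasticsearch APIs.

```python
from collections import Counter

# Hypothetical documents; only the tag values come from the post.
docs = [
    {"id": 1, "tags": ["foo", "bar"]},
    {"id": 2, "tags": ["foo-bar"]},
    {"id": 3, "tags": ["bar"]},
]


def matching_docs(documents, typed):
    """Stand-in for the ngram query: keep documents where any tag contains `typed`."""
    return [d for d in documents if any(typed in tag for tag in d["tags"])]


def terms_buckets(documents):
    """Stand-in for a terms aggregation: count every tag on the given documents."""
    return dict(Counter(tag for d in documents for tag in d["tags"]))


hits = matching_docs(docs, "fo")
buckets = terms_buckets(hits)
print(buckets)  # {'foo': 1, 'bar': 1, 'foo-bar': 1}

# What is actually wanted: only the buckets whose key contains the typed text.
wanted = {key: count for key, count in buckets.items() if typed_in(key)} if False else {
    key: count for key, count in buckets.items() if "fo" in key
}
print(wanted)  # {'foo': 1, 'foo-bar': 1}
```

Document 1 matches because of "foo", so its other tag "bar" is counted as well, which is exactly the unwanted bucket.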

Is there any way I can filter the buckets returned, so that I only get
buckets whose keys match the ngram I've searched for? Or am I approaching
this all wrong...
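
One thing that may be worth checking against the documentation for the version in use: the terms aggregation accepts an `include` parameter, a regular expression that restricts which terms become buckets. Below is a minimal sketch of such a request body, built in Python for convenience. The field name `tags` (standing for the whitespace-analyzed copy), the aggregation name `matching_tags`, and the helper `build_tag_suggest_body` are assumptions, because the mapping in the gist is truncated here; only `tag_ngrams` and the `terms` query come from the post.

```python
import json
import re


def build_tag_suggest_body(typed, ngram_field="tag_ngrams", tag_field="tags"):
    """Build a search body that scopes by ngram and filters buckets by regex.

    `tag_field` and the aggregation name are assumed, not taken from the gist.
    """
    # Escape the user's input so it is matched literally. Escaping rules can
    # differ between regex engines, so treat this as an assumption to verify.
    pattern = ".*" + re.escape(typed) + ".*"
    return {
        # Same scoping query as in aggregated-query.js.
        "query": {"terms": {ngram_field: [typed]}},
        "aggs": {
            "matching_tags": {
                # `include` keeps only the terms that match the pattern.
                "terms": {"field": tag_field, "include": pattern}
            }
        },
    }


body = build_tag_suggest_body("fo")
print(json.dumps(body, indent=2))
```

If this works as described, the "bar" bucket should be dropped because "bar" does not match `.*fo.*`, while "foo" and "foo-bar" remain.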

I'm using Elasticsearch 1.2.1, Java 7, and have been running these tests on
my development machine.

Thanks in advance!

Graham

P.S. The Definitive Guide is superb - it makes Elasticsearch the most
thoroughly documented open source project I've tried learning in many years.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/43f539fe-cc2e-4c26-bb2e-fe958ccfe323%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Nitin_Maheshwari · August 28, 2014, 6:17am

I'd be interested to know the approach for this as well. Thanks.

