Setting up autocomplete to return tags, not documents


(Graham Ashton) #1

Hi. I could do with a bit of help getting a list of tags (as defined by my
app's users) that match the text they've already typed.

When a user starts typing the name of a tag I want to suggest the names of
all the tags that contain the string that they've typed. If some of my
documents are tagged with "foo", "bar" or "foo-bar" I'd like to suggest:

  1. "foo" and "foo-bar" if they type "fo", or
  2. "bar" and "foo-bar" if they type "ba"
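To make the desired behaviour concrete, here is a minimal sketch in plain Python. It has nothing to do with Elasticsearch itself; the function name `suggest_tags` and the in-memory set of tags are made up purely for illustration, and matching is assumed to be case-insensitive.

```python
def suggest_tags(all_tags, typed):
    """Return every tag whose name contains the typed text, sorted by name."""
    needle = typed.lower()
    return sorted(tag for tag in all_tags if needle in tag.lower())


# Hypothetical tag names, taken from the example above.
tags = {"foo", "bar", "foo-bar"}
print(suggest_tags(tags, "fo"))  # ['foo', 'foo-bar']
print(suggest_tags(tags, "ba"))  # ['bar', 'foo-bar']
```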

I've put all the data, mappings, queries and index settings in this gist:
https://gist.github.com/gma/caef19e6271aec0f9e56

tl;dr - I'd like to remove the "key": "bar" bucket from results.js.

Here's the long version.

I've seen the Completion Suggester, but as I understand it, it won't cope
with (2).

So I've started down the route of indexing my tags as ngrams, as described
in the Definitive Guide. Searching on the ngrams works really well for
suggesting documents that have been tagged with a given tag (e.g. "fo"
finds "foo-bar", etc.), but...

...I want to return to the user a list of tags, rather than a list of
documents that are tagged with a tag that contains the ngram the user has
typed in.

I asked how to do this on IRC a while ago, and @dadoonet said I needed a
"terms aggregation" (thanks!). I've now got a query that scopes my
aggregation so that I can search for tags by ngrams.

I've actually indexed the tags on my documents twice; once with the
whitespace analyzer, and once with a custom analyzer that sets up ngram
indexing.
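For readers without the gist to hand, the double indexing might look roughly like the sketch below, written as a Python dict that could be serialised as the body of an index-creation request. Everything named here is an assumption rather than a copy of the real settings: the `document` type, the `tags` field with its `ngrams` sub-field, the `tag_ngrams` analyzer, the `tag_ngram_filter` filter and the gram sizes are all hypothetical, and the syntax follows the 1.x-era mapping format (`string` fields, `index_analyzer`).

```python
import json

# Hypothetical index definition: "tags" is indexed with the whitespace
# analyzer, and its "ngrams" sub-field with a custom ngram analyzer.
index_body = {
    "settings": {
        "analysis": {
            "filter": {
                # Assumed gram sizes; tune to taste.
                "tag_ngram_filter": {"type": "nGram", "min_gram": 2, "max_gram": 20}
            },
            "analyzer": {
                "tag_ngrams": {
                    "type": "custom",
                    "tokenizer": "whitespace",
                    "filter": ["lowercase", "tag_ngram_filter"],
                }
            },
        }
    },
    "mappings": {
        "document": {
            "properties": {
                "tags": {
                    "type": "string",
                    "analyzer": "whitespace",
                    "fields": {
                        "ngrams": {
                            "type": "string",
                            # Ngrams at index time only, so "fo" is not
                            # itself split into grams when searching.
                            "index_analyzer": "tag_ngrams",
                            "search_analyzer": "whitespace",
                        }
                    },
                }
            }
        }
    },
}

print(json.dumps(index_body, indent=2))
```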

It works fine until a document is tagged with multiple tags. If I tag a
document with "foo" and "bar", both those tags are returned when I search
for tags that contain the ngram "fo".

Why? The aggregation finds all documents that have a tag containing the
ngram "fo", and then returns all the tags on that set of documents.
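The effect can be reproduced without Elasticsearch at all. The following is only a simulation in plain Python, using made-up documents and hypothetical helper names (`matching_docs`, `tag_buckets`): it first selects the documents that have a tag containing the typed text, then counts every tag on those documents, which is roughly what the scoped aggregation does.

```python
from collections import Counter

# Made-up documents; document 1 carries two unrelated tags.
docs = [
    {"id": 1, "tags": ["foo", "bar"]},
    {"id": 2, "tags": ["foo-bar"]},
    {"id": 3, "tags": ["baz"]},
]


def matching_docs(docs, typed):
    """Documents with at least one tag containing the typed text."""
    return [d for d in docs if any(typed in tag for tag in d["tags"])]


def tag_buckets(docs):
    """Count every tag on the given documents, matching or not."""
    return Counter(tag for d in docs for tag in d["tags"])


buckets = tag_buckets(matching_docs(docs, "fo"))
print(sorted(buckets))  # ['bar', 'foo', 'foo-bar']
```

The unwanted "bar" bucket appears because it rides along on document 1, which matched via "foo".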

Is there any way I can filter the buckets returned, so that I only get
buckets whose keys match the ngram I've searched for? Or am I approaching
this all wrong...
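One direction that may be worth checking against the documentation for your version: the terms aggregation accepts an `include` parameter holding a regular expression, which restricts the buckets to terms that match it. The sketch below builds such a request body as a Python dict and also shows a client-side fallback that filters the returned buckets. The field names `tags` and `tags.ngrams`, the shape of the query, and both function names are assumptions for illustration, not the contents of the gist; note too that `re.escape` follows Python's regex rules, which may not line up exactly with the regex dialect Elasticsearch uses.

```python
import re


def build_tag_request(typed, ngram_field="tags.ngrams", raw_field="tags"):
    """Hypothetical search body: match on ngrams, bucket on whole tags."""
    pattern = ".*" + re.escape(typed) + ".*"
    return {
        "size": 0,  # only the aggregation is of interest, not the hits
        "query": {"match": {ngram_field: typed}},
        "aggs": {
            "tags": {"terms": {"field": raw_field, "include": pattern}}
        },
    }


def keep_matching_buckets(buckets, typed):
    """Client-side fallback: drop buckets whose key lacks the typed text."""
    return [b for b in buckets if typed in b["key"]]


# Buckets shaped like a terms aggregation response, with made-up counts.
sample_buckets = [
    {"key": "foo", "doc_count": 1},
    {"key": "bar", "doc_count": 1},
    {"key": "foo-bar", "doc_count": 1},
]
print([b["key"] for b in keep_matching_buckets(sample_buckets, "fo")])  # ['foo', 'foo-bar']
```

Filtering on the client is the less elegant of the two, since buckets that are about to be discarded still count towards the aggregation's `size` limit.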

I'm using Elasticsearch 1.2.1, Java 7, and have been running these tests on
my development machine.

Thanks in advance!

Graham

P.S. The Definitive Guide is superb - it makes Elasticsearch the most
thoroughly documented open source project I've tried learning in many years.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/43f539fe-cc2e-4c26-bb2e-fe958ccfe323%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Nitin Maheshwari) #2

I'd be interested to know the approach for this as well. Thanks.

On Wednesday, 27 August 2014 18:48:10 UTC+5:30, Graham Ashton wrote:

I've put all the data, mappings, queries and index settings in this gist:
https://gist.github.com/gma/caef19e6271aec0f9e56


