Tag based searching

I am running a Rails application where I need to implement user search
using elasticsearch. There are many registered users on the website. A user
can submit information such as college, company, address, emails, etc. I
need a common search box that shows suggestions based on the text as it is
typed. I also want to show the number of users associated with each
suggestion when it is shown (using facets). The suggestions can be anything.

Since I want to use elasticsearch for implementing the above functionality,
I came up with the following structure:

There will be a user document containing the user_id and other relevant
user information. And there will be a tag document with the fields tag_type
and tag_value. tag_type can be company, college, etc., while tag_value is
the name. The id of the tag document will be generated from its tag_value
and tag_type.

In such a scenario, a user can be associated with many tags, and a tag can
be associated with many users.
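
For illustration only, the structure described above might look roughly like this. It is a minimal sketch in plain Python dicts: the hashing scheme in tag_id, the "_id" key, the "tag_ids" field on the user, and the sample values are all assumptions, not something stated in the post.

```python
import hashlib


def tag_id(tag_type, tag_value):
    """Derive a stable id from tag_type and tag_value.

    Normalising case/whitespace and hashing with SHA-1 is an assumption;
    the post only says the id is generated from these two fields.
    """
    key = "{}/{}".format(tag_type.strip().lower(), tag_value.strip().lower())
    return hashlib.sha1(key.encode("utf-8")).hexdigest()


def make_tag(tag_type, tag_value):
    """Build a tag document with the two fields named in the post."""
    return {
        "_id": tag_id(tag_type, tag_value),
        "tag_type": tag_type,
        "tag_value": tag_value,
    }


# Hypothetical sample data: one user linked to two tags (many-to-many is
# expressed here by storing the tag ids on the user document).
tags = [make_tag("college", "MIT"), make_tag("company", "Acme")]
user = {"user_id": 42, "tag_ids": [t["_id"] for t in tags]}
```

Because the id depends only on the type and value, two users who enter the same college end up pointing at the same tag document.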

I don't know how to define the structure in elasticsearch to implement the
above. Parent child relationship can't be applied here. I need to know how
to define such kind of relationships in elasticsearch.

--
Thanks,
Aash

--

Faceted search is what you are looking for.
Look into facets in -
http://www.elasticsearch.org/guide/reference/api/search/facets/index.html

--

Can you please give me an example whereby using the above structure, I can
query elasticsearch and get relevant results? I am sorry for the
inconvenience, but I am completely new to elasticsearch.
Thanks a lot.

--
Thanks,
Aash

--

Why can't you simplify things a bit and just add company, college,
address, email, etc. as user fields?

--

I guess I could do that too.
In the structure that I mentioned, the problem I am facing is that when
computing facets against "tag_value", the returned values only contain
the facet count and its corresponding "tag_value". In that result, I
also want the corresponding "tag_type", i.e. the type of tag (college,
company, etc.) each result is associated with. How do I do that?

--
Thanks,
Aash

--

If you store values in designated fields, you can simply run facets on
these fields. So each field (company, college, ...) will have its own
facet.
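
As a rough sketch of what "one facet per field" could look like, the snippet below only builds the search request body as a Python dict, using the terms facet syntax from the facets API discussed in this thread. The helper name build_facet_request, the default match_all query, and the field list are assumptions for illustration; sending the body to a cluster is left out.

```python
import json


def build_facet_request(fields, query=None, size=10):
    """Build a search body with one terms facet per designated field.

    `fields` are user-document field names (e.g. company, college). They
    should be not-analyzed, otherwise the facet returns individual tokens
    rather than whole values.
    """
    body = {
        "size": 0,  # only the facet counts are wanted, not the user hits
        "query": query if query is not None else {"match_all": {}},
        "facets": {},
    }
    for field in fields:
        # The facet is named after its field, so the response tells you
        # which type (company, college, ...) each value belongs to.
        body["facets"][field] = {"terms": {"field": field, "size": size}}
    return body


body = build_facet_request(["company", "college"])
print(json.dumps(body, indent=2))
```

Naming each facet after its field is what recovers the "tag_type" that was missing from the single tag_value facet. Note that newer Elasticsearch releases replaced facets with aggregations, so this body matches the API of the thread's era only.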

--

Yeah, that makes sense. I will try it soon and see if it meets my needs.
Thanks a lot!

--
Thanks,
Aash

--

Hi,
The solution suggested above works fine. I have only one question. Since I
am using a global suggestion box, I want the results to be sorted by
relevance. As I am using multiple terms facets to get the results, is there
a way to sort the results of all the facet_terms globally by relevance and
get one global results array?

--
Thanks,
Aash

--

Facets cannot be sorted by relevance. They can be sorted by the number of
records they show up in. Assuming that this is what you meant by relevance,
there are a few ways to create a single combined list of facets. First of
all, you can combine facets on the client: just merge all the facet lists
together and sort them by count. You can also run facets against multiple
fields (see the Multi Field section on the Terms Facet page,
http://www.elasticsearch.org/guide/reference/api/search/facets/terms-facet.html),
but this is probably not what you want, since you will not know which field
each value came from. Alternatively, you can go back to your original setup
and create a facet on a script field that would return something like
(tag_type + "/" + tag_value), but it might be quite slow. A faster
variation of this solution would be to store (tag_type + "/" + tag_value)
as a special not-analyzed field and run facets on it, but because this
field can potentially have a very large number of values per record, this
solution can require a lot of memory.
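
The first option (merging on the client) could be sketched as follows. This assumes the usual terms facet response shape, where each facet carries a "terms" list of {"term", "count"} entries, and that each facet is named after its field. The function name merge_facets and the sample data are made up for illustration.

```python
def merge_facets(facets):
    """Flatten per-field terms facets into one list sorted by count.

    `facets` is the "facets" section of a search response, keyed by facet
    name (assumed here to equal the field name, i.e. the tag type).
    """
    merged = []
    for field, facet in facets.items():
        for entry in facet.get("terms", []):
            merged.append({
                "tag_type": field,
                "tag_value": entry["term"],
                "count": entry["count"],
            })
    # Highest count first; ties broken alphabetically for a stable order.
    merged.sort(key=lambda e: (-e["count"], e["tag_type"], e["tag_value"]))
    return merged


# Hypothetical response fragment, not real output.
sample_facets = {
    "company": {"_type": "terms", "terms": [
        {"term": "Acme", "count": 7},
        {"term": "Globex", "count": 2},
    ]},
    "college": {"_type": "terms", "terms": [
        {"term": "MIT", "count": 5},
    ]},
}

for row in merge_facets(sample_facets):
    print(row["tag_type"], row["tag_value"], row["count"])
```

Unlike a multi-field facet, this keeps track of which field each value came from, at the cost of one facet per field in the request.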

--

Thanks a lot for your reply. I will try it out.

--
Thanks,
Aash

--

Well, relevance-based search is important for my problem. Therefore, I have
decided to move forward with the following structure:

There will be a user document which will contain tags as nested documents.
In addition, the same tag documents will be indexed separately.

To trigger the search as the user types, I will first search the user
documents based on tag_value (using facets). In return I will get the
matching tag_ids. Next, I will search the tag documents (which are
separately indexed) and get the corresponding tag_type and tag_value. In
the latter search, I will search against the user-typed input again, so
that I get the output sorted by relevance.

In total, to get the final results, I need to make two searches against
the ES documents.
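
A minimal sketch of how the two result sets might be joined on the client is shown below. It does not talk to Elasticsearch. It assumes the first search returned a terms facet keyed by tag_id and the second returned ordinary hits (with "_id", "_score" and "_source") already sorted by score. The function name combine_results and all sample values are hypothetical.

```python
def combine_results(facet_terms, tag_hits):
    """Join user counts (search 1) with relevance-ordered tags (search 2).

    facet_terms: [{"term": <tag_id>, "count": <users>}, ...] from the facet
    tag_hits:    hits from the tag index, in descending _score order
    """
    counts = {entry["term"]: entry["count"] for entry in facet_terms}
    suggestions = []
    for hit in tag_hits:  # iterating in hit order preserves relevance order
        if hit["_id"] not in counts:
            continue  # tag matched the text but no matching user carries it
        suggestions.append({
            "tag_type": hit["_source"]["tag_type"],
            "tag_value": hit["_source"]["tag_value"],
            "users": counts[hit["_id"]],
            "score": hit["_score"],
        })
    return suggestions


# Hypothetical data standing in for the two responses.
facet_terms = [{"term": "t1", "count": 12}, {"term": "t2", "count": 3}]
tag_hits = [
    {"_id": "t2", "_score": 2.1,
     "_source": {"tag_type": "company", "tag_value": "Acme Corp"}},
    {"_id": "t1", "_score": 1.4,
     "_source": {"tag_type": "college", "tag_value": "Acme College"}},
    {"_id": "t9", "_score": 0.9,
     "_source": {"tag_type": "company", "tag_value": "Acme Labs"}},
]

suggestions = combine_results(facet_terms, tag_hits)
```

The suggestions keep the relevance order of the second search while carrying the user counts from the first.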

--
Thanks,
Aash

--