Facets as a sub query


(wgerber) #1

Hello. I am new to ElasticSearch and really loving it. I have a lot of the basic queries working, but I am having difficulty figuring out how to use a facet as a sub query. Any guidance, or even a link to a good piece of documentation that I missed, would be very welcome.

Here is an example:

I am indexing Folders and Files. Each can have "type" (folder or file), "name", and "description". If it is a folder, it can have "children" which themselves would be folders or files.

So, I can do a facet query to count up all of the children:
{
  "facets": {
    "childrenCount": {
      "terms": {
        "field": "children._type"
      }
    }
  }
}

And the results will contain:
....
terms: [
  {
    term: folder
    count: 41
  }
  {
    term: file
    count: 63
  }
]
}
.........
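
For reference, counts like these can be pulled out of a parsed response with a few lines of Python. This is only a sketch: it assumes the response carries a top-level "facets" object keyed by facet name, each holding a "terms" list of term/count entries as in the excerpt above. The helper name facet_counts and the sample data are made up for illustration.

```python
# Sketch: turn a terms-facet response into a {term: count} mapping.
# The response shape is an assumption based on the excerpt above.

def facet_counts(response, facet_name):
    """Return {term: count} for one terms facet in a parsed search response."""
    facet = response.get("facets", {}).get(facet_name, {})
    return {entry["term"]: entry["count"] for entry in facet.get("terms", [])}


# Hand-written sample mirroring the counts quoted above (not real server output).
sample_response = {
    "facets": {
        "childrenCount": {
            "terms": [
                {"term": "folder", "count": 41},
                {"term": "file", "count": 63},
            ]
        }
    }
}

counts = facet_counts(sample_response, "childrenCount")
print(counts)
```

A missing facet simply yields an empty mapping, so the caller does not need to special-case folders without children.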

That is all fine and dandy. But here is what I am trying to do: I want to be able to say something like "Give me all the folders with 'blue' in the name and count all of the children for EACH folder".
The result set that I would love to have would then list, say, 5 folders, and for each hit it would give a separate count of children. (For now, ignore the fact that child folders can have children of their own.)

So, is this possible? If so, please shed some light for me. Thank you!

William


(Shay Banon) #2

There are several ways to restrict the set of documents a facet runs on. The most common one is the facet_filter element, which can be placed on any facet definition. See here: http://www.elasticsearch.org/guide/reference/api/search/facets/ (at the bottom).

Alternatively, a filter set on the search request might fit better: http://www.elasticsearch.org/guide/reference/api/search/filter.html.
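
As a rough illustration of the facet_filter suggestion, the request body could be built like this in Python. This is a sketch under assumptions: the field names come from the question, the term filter on "name" assumes the name field is analyzed into lowercase tokens, and the helper name children_facet_request is made up. Note that this produces one combined set of counts across all matching folders, not a separate count per folder.

```python
import json


def children_facet_request(name_term):
    """Build a search body whose terms facet is restricted by a facet_filter."""
    # Assumption: "name" is analyzed, so a lowercase term such as "blue" matches.
    name_filter = {"term": {"name": name_term}}
    return {
        "query": {"match_all": {}},
        "facets": {
            "childrenCount": {
                "terms": {"field": "children._type"},
                # facet_filter narrows only the documents this facet counts;
                # the hits returned by the query are not affected.
                "facet_filter": name_filter,
            }
        },
    }


body = children_facet_request("blue")
print(json.dumps(body, indent=2))
```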

-shay.banon
In reply to wgerber's message of Monday, May 23, 2011 at 10:38 PM:
http://elasticsearch-users.115913.n3.nabble.com/Facets-as-a-sub-query-tp2976668p2976668.html


(wgerber) #3

Shay,
Hello there. Thanks for responding so quickly. I had actually tried
both of the links you provided before, but they aren't quite what I am
looking for, unless there is a different way to use them than I am
seeing. Let me give an example of the result set that I would love to
get back.

I query for all folders that have "blue" in the name and ask it to do
a facet count of the children for each hit in the result set. I get
back the following:

Hit 1:
  name = Blue Swan
  description = .....
  Facet count of children:
    type = folder, count = 3
    type = file, count = 3

Hit 2:
  name = Blue Dragon
  description = .....
  Facet count of children:
    type = folder, count = 13
    type = file, count = 1

Hit 3:
  name = Blue Cobalt
  description = .....
  Facet count of children:
    type = file, count = 3

Is this possible in one request or will I have to do the Facet counts
one at a time for each folder? Thank you for your help and patience
with a beginner!

In reply to Shay Banon's message of May 23, 6:06 pm.


(plaflamme) #4

Hi,

This is a "group by" query, which I don't think ES supports directly.
You can make n+1 queries, though: 1 to get your hits and then n to get
the counts (one per hit). Obviously, this is a brute-force approach and
is not going to scale. You can forget about sorting on the counts too...
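
A minimal Python sketch of the n+1 approach might look like the following. Everything outside the thread is an assumption: `search` is a hypothetical callable that takes a request body and returns the parsed JSON response, the bool/term/ids query bodies are one plausible way to express the two steps, and the response shapes ("hits.hits", "_source", "facets") are assumed. The stub at the bottom stands in for a real cluster so the example runs on its own.

```python
def folders_with_child_counts(search, name_term):
    """Run 1 query for the hits, then 1 facet query per hit (n+1 in total).

    `search` is a hypothetical callable: request body (dict) -> response (dict).
    """
    # Query 1: folders whose name contains the term (field names from the thread).
    hits_response = search({
        "query": {
            "bool": {
                "must": [
                    {"term": {"name": name_term}},
                    {"term": {"type": "folder"}},
                ]
            }
        }
    })
    results = []
    for hit in hits_response["hits"]["hits"]:
        # Queries 2..n+1: facet over the children of this single folder.
        facet_response = search({
            "query": {"ids": {"values": [hit["_id"]]}},
            "size": 0,
            "facets": {"childrenCount": {"terms": {"field": "children._type"}}},
        })
        terms = facet_response["facets"]["childrenCount"]["terms"]
        counts = {t["term"]: t["count"] for t in terms}
        results.append((hit["_source"]["name"], counts))
    return results


def stub_search(body):
    """Stand-in for a real cluster; returns canned data for two folders."""
    if "facets" in body:
        canned = {
            "1": [{"term": "folder", "count": 3}, {"term": "file", "count": 3}],
            "2": [{"term": "file", "count": 3}],
        }
        doc_id = body["query"]["ids"]["values"][0]
        return {"facets": {"childrenCount": {"terms": canned[doc_id]}}}
    return {"hits": {"hits": [
        {"_id": "1", "_source": {"name": "Blue Swan"}},
        {"_id": "2", "_source": {"name": "Blue Cobalt"}},
    ]}}


for name, counts in folders_with_child_counts(stub_search, "blue"):
    print(name, counts)
```

The per-hit loop is what makes this brute force: each page of results costs one extra round trip per folder, and the counts only exist client-side, so they cannot drive the sort order of the first query.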

Hope it helps,
Philippe

In reply to William's message of Tue, May 24, 2011 at 09:21.


(wgerber) #5

Thanks for your help, Philippe. I took a look, and it seems that
"group by" support was added to Lucene just a week and a half ago. It
will probably take a while for it to reach ES, so I am stuck with the
brute-force method for now. Not the end of the world. At least I now
know that I am not missing anything obvious by doing it the brute-force
way.

William

In reply to Philippe Laflamme's message of May 24, 8:35 am.

