Aggregate multiple value fields separately in terms_stats facet?


(Hari Shankar) #1

Hi,

I tried giving multiple value_fields in the terms stats facet. But then it
seems to add up all the value fields when calculating the statistics. Can I
instead make it compute the stats separately for each field that I specify
in the value_field? That is, do the grouping on the basis of the key_field
as usual, but give me the statistics for multiple fields. Is that possible
in the current version (0.16)?

Thanks,
Hari


(Shay Banon) #2

You can simply use multiple terms stats facets, one per value field (with the same key field).
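
A minimal sketch of that suggestion as a request body, built in Python purely for illustration. The field names `customer`, `clicks` and `logins` and the facet names are hypothetical; the `terms_stats` keys (`key_field`, `value_field`, `order`, `size`) are the facet's options as I understand them, so check them against your version.

```python
import json


def terms_stats_facet(key_field, value_field, order="total", size=20):
    """Build one terms_stats facet definition."""
    return {
        "terms_stats": {
            "key_field": key_field,
            "value_field": value_field,
            "order": order,
            "size": size,
        }
    }


def multi_value_body(key_field, value_fields):
    """One facet per value field, all sharing the same key field."""
    return {
        "query": {"match_all": {}},
        "facets": {
            f"{field}_stats": terms_stats_facet(key_field, field)
            for field in value_fields
        },
    }


# Hypothetical field names, used only for illustration.
body = multi_value_body("customer", ["clicks", "logins"])
print(json.dumps(body, indent=2))
```

Each facet in this body is ordered on its own value field, so the facets will not necessarily return the same terms in the same order.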



(Hari Shankar) #3

Yep, I thought of that, but I would like to maintain order based on one
particular field. E.g., I want the top 20 terms based on field 1, but for
those 20 terms I want aggregations on field 2 and field 3 as well.

Right now I do the aggregation on field 1 with the order set to "total",
then run a second query with a filter for the terms that were returned by
the first query.



(Shay Banon) #4

OK, I might still be missing something, but I still don't see why you can't use several facets (in the same request) under different names (on the second search request). One facet will be on the key field and field2, the second will be on the key field and field3.



(Hari Shankar) #5

I am unable to maintain order when I use multiple facets. E.g., let's say I
have this data:

Customer  Clicks  Logins
C1        3       4
C1        4       4
C1        5       2
C2        2       5
C2        2       8
C2        2       1

Now, let us say I do a terms_stats facet on this, with key_field as Customer
and value field as Clicks, and order is set to "total". So I will get back
this:
C1 --> 12 clicks
C2 --> 6 clicks

Now if I add another facet for Logins and order it by total, I will get
C2 --> 14 logins
C1 --> 10 logins

What I would like to do is maintain order between these two. That is, I want
data ordered by total clicks:
C1 --> 12 clicks, 10 logins
C2 --> 6 clicks, 14 logins

or I could have them ordered by logins, in which case it would be:
C2 --> 14 logins, 6 clicks
C1 --> 10 logins, 12 clicks

Is that what you were thinking? Is that somehow doable currently?
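
To make the desired output concrete, here is a small client-side sketch in plain Python (no Elasticsearch involved) that recomputes the totals from the sample rows above and prints both orderings. The variable and function names are only illustrative.

```python
from collections import defaultdict

# Sample rows from the post: (customer, clicks, logins)
rows = [
    ("C1", 3, 4), ("C1", 4, 4), ("C1", 5, 2),
    ("C2", 2, 5), ("C2", 2, 8), ("C2", 2, 1),
]

# Sum clicks and logins per customer.
totals = defaultdict(lambda: {"clicks": 0, "logins": 0})
for customer, clicks, logins in rows:
    totals[customer]["clicks"] += clicks
    totals[customer]["logins"] += logins


def ordered_by(field):
    """Return (customer, clicks, logins) tuples, largest total of `field` first."""
    return sorted(
        ((c, t["clicks"], t["logins"]) for c, t in totals.items()),
        key=lambda item: item[1] if field == "clicks" else item[2],
        reverse=True,
    )


print(ordered_by("clicks"))  # [('C1', 12, 10), ('C2', 6, 14)]
print(ordered_by("logins"))  # [('C2', 6, 14), ('C1', 12, 10)]
```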



(Shay Banon) #6

What you can do in this case is make two search calls. The first gets the terms stats for Customer and Clicks. The second does a terms stats on Customer and Logins, but has a facet_filter with a terms filter holding the customers you want to get the results back for.

If you need the other ordering, first by logins and then clicks, you can swap the two calls.

Also, make sure to set search_type to count where possible, if you don't need the hits back.
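
A rough sketch of the two-call approach, written as plain Python functions that only build request bodies and merge responses. Nothing here talks to a cluster: the helper names, facet names (`primary`, `secondary`), field names, and the hand-written responses are all assumptions for illustration, and the response shape (`facets.<name>.terms[]` with `term` and `total`) should be checked against what your version actually returns.

```python
def first_body(key_field, order_field, size=20):
    """Call 1: top terms ordered by the total of order_field."""
    return {
        "query": {"match_all": {}},
        "facets": {
            "primary": {
                "terms_stats": {
                    "key_field": key_field,
                    "value_field": order_field,
                    "order": "total",
                    "size": size,
                }
            }
        },
    }


def second_body(key_field, other_field, terms):
    """Call 2: stats on other_field, restricted to the terms from call 1."""
    return {
        "query": {"match_all": {}},
        "facets": {
            "secondary": {
                "terms_stats": {
                    "key_field": key_field,
                    "value_field": other_field,
                    "size": len(terms),
                },
                # facet_filter sits next to terms_stats inside the facet.
                "facet_filter": {"terms": {key_field: terms}},
            }
        },
    }


def merge(first_response, second_response):
    """Keep the order from call 1 and attach the totals from call 2."""
    secondary = {
        e["term"]: e["total"]
        for e in second_response["facets"]["secondary"]["terms"]
    }
    return [
        (e["term"], e["total"], secondary.get(e["term"]))
        for e in first_response["facets"]["primary"]["terms"]
    ]


# Hand-written stand-ins for the two responses (not real server output).
resp1 = {"facets": {"primary": {"terms": [
    {"term": "C1", "total": 12}, {"term": "C2", "total": 6}]}}}
terms = [e["term"] for e in resp1["facets"]["primary"]["terms"]]
body2 = second_body("customer", "logins", terms)
resp2 = {"facets": {"secondary": {"terms": [
    {"term": "C2", "total": 14}, {"term": "C1", "total": 10}]}}}

print(merge(resp1, resp2))  # [('C1', 12, 10), ('C2', 6, 14)]
```

The merge step is what preserves the ordering: the second facet can come back in any order, because the final list is driven by the terms from the first call.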



(Hari Shankar) #7

OK, that is the way we are doing it currently, but it would be faster if
this were done inside ES itself, right? Thanks for the clarification, Shay!



(Shay Banon) #8

Actually, it depends on the type of data. It might be faster to do it by executing the query twice, and in other conditions, it might be faster to have a compound facet implementation for that in ES (which we don't have currently).


