Aggregations on nested array types


(dazraf) #1

Hi,

Gist: https://gist.github.com/dazraf/9935814

Basically, I'd like to be able to aggregate a field of an array of
observations, grouped by an ancestor/parent id.
So for example (see gist): Aggregate the timings per contestant across a
set of contests.
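
To pin down the result being asked for, here is a minimal in-memory sketch in plain Python rather than Elasticsearch. The sample values are invented, and the field names (`contestant`, `id`, `data`, `timing`) are assumed from the structure summarised later in this thread; the gist data itself is not reproduced here.

```python
from collections import defaultdict

# Hypothetical sample data: two contest documents, each with an array of
# contestants, each contestant with an array of observations ("data").
contests = [
    {"contestant": [
        {"id": "alice", "data": [{"timing": 10}, {"timing": 12}]},
        {"id": "bob", "data": [{"timing": 11}]},
    ]},
    {"contestant": [
        {"id": "alice", "data": [{"timing": 9}]},
        {"id": "bob", "data": [{"timing": 14}, {"timing": 13}]},
    ]},
]


def total_timing_by_contestant(docs):
    """Sum every observation's timing, grouped by the parent contestant id."""
    totals = defaultdict(int)
    for doc in docs:
        for contestant in doc.get("contestant", []):
            for observation in contestant.get("data", []):
                totals[contestant["id"]] += observation["timing"]
    return dict(totals)


print(total_timing_by_contestant(contests))  # {'alice': 31, 'bob': 38}
```

The grouping key lives one level above the values being summed, which is the part that a flat field aggregation cannot express.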

I realise that the data can be structured differently - effectively
flattened to a document per contest-contestant-observation.
However, I don't have the luxury of doing this in the real-world case.

Any help much appreciated.

Thank you.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/22d6273a-b6a9-4e7b-8364-9011bf34ef5e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(dazraf) #2

Hi,
I've also experimented with nested types using dynamic templates.
Interesting (empty!) aggregation results!
Gist: https://gist.github.com/dazraf/9937198

Would be grateful if anyone can shed some light on this please?

Thank you.



(Mark Harwood-2) #3

A rough Gist here that sums OK with one level of
nesting: https://gist.github.com/markharwood/9938890



(dazraf) #4

Thanks very much Mark! I'll study this and respond back on this thread.



(dazraf) #5

Hi Mark,

thanks again for helping with this.

I'm wondering why, in the solution, the mapping doesn't include the data
node in the tree.
In fact, when I explicitly state the data node as a property of contestant,
the aggregations come back blank. Gist:
https://gist.github.com/dazraf/9957039

many thanks!



(dazraf) #6

I thought it may be useful to summarise what we're trying to do:

Let the document structure be:

contest:
  contestant: // [array]
    id: string
    data: // [array]
      run: integer // note that I've added this to highlight one of our use cases
      timing: integer
      // ... other attributes

We don't have control of this structure. Clients of our service are free to
define it as they wish (hence, I guess we need to use dynamic templates to
define the mappings).
The structure may be *arbitrarily* deep, with nested array types.
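
One way the dynamic-template idea could look is sketched below as a Python dict that serialises to a mapping body. It assumes that every JSON object, at any depth, should be indexed as a `nested` type. The template name `objects_as_nested` is made up for illustration, and the outer envelope varies by Elasticsearch version (releases contemporary with this thread place `dynamic_templates` under a type name), so treat it as a starting point rather than a tested mapping.

```python
import json

# Assumption: all objects, however deep, are mapped as nested so that each
# array element keeps its own identity for nested aggregations.
mapping_body = {
    "mappings": {
        "dynamic_templates": [
            {
                "objects_as_nested": {  # hypothetical template name
                    "match_mapping_type": "object",
                    "mapping": {"type": "nested"},
                }
            }
        ]
    }
}

# Print the JSON that would be sent when creating the index.
print(json.dumps(mapping_body, indent=2))
```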

The types of aggregations the client would like to make:

  1. find total of timing by contestant id (explained very nicely with your
    answer)
  2. find average of timing across all contestants by run. In this case the
    grouping is by "run"

I can see how to do each of the above, albeit with different mappings. What's
required is one mapping that offers a flexible way to do both (ideally with
full Kibana support).

thanks
Fuzz.



(dazraf) #7

I thought it may be useful to summarise what we're trying to do:

Let the document structure be:

contest:
  contestant: // [array]
    id: string
    data: // [array]
      run: integer // note that I've added this to highlight one of our use cases
      timing: integer
      // ... other attributes

We don't have control of this structure. Clients of our service are free to
define it as they wish (hence, I guess we need to use dynamic templates to
define the mappings).
The structure may be *arbitrarily* deep, with nested array types.

Some examples of the types of aggregations the client would like to make:

  1. find total of timing by contestant id (explained very nicely with your
    answer)
  2. find average of timing across all contestants by run. In this case the
    grouping is by "run"

I can see how to carry out each of the above, albeit with distinct
mappings. However, what we need is one mapping that can do both queries
(ideally one that works with Kibana).

thanks
Fuzz.



(system) #8