Reduce-style aggregators


(mccraig mccraig) #1

i keep finding that i want to do more generic metric aggregations than are
currently available

e.g. to get group-by-like behaviour, so that given docs:

{name: "foo", location: "london", revenue: 100}
{name: "foo", location: "paris", revenue: 500}
{name: "bar", location: "sydney", revenue: 15}
{name: "bar", location: "new york", revenue: 23}

i can get a result like this (collect the locations for a name, and sum the
revenues) :

{name: "foo", location: ["london","paris"], revenue: 600}
{name: "bar", location: ["sydney", "new york"], revenue: 38}

i.e. the kinds of things you can do with sql group-by, postgresql window
functions or cascading/cascalog aggregators
[https://github.com/nathanmarz/cascalog/wiki/Guide-to-custom-operations#wiki-aggregators]

it seems that the es1.0 aggregations framework could straightforwardly
support this type of aggregation operation, e.g. as a reduce of the
documents in a bucket
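
To make the idea concrete, here is a minimal sketch in plain Python of "a reduce of the documents in a bucket", using the four example docs above. It is not Elasticsearch code: the helper names `bucket_by`, `step` and `aggregate` are hypothetical, and the bucketing is done in memory purely for illustration.

```python
from collections import defaultdict
from functools import reduce

docs = [
    {"name": "foo", "location": "london", "revenue": 100},
    {"name": "foo", "location": "paris", "revenue": 500},
    {"name": "bar", "location": "sydney", "revenue": 15},
    {"name": "bar", "location": "new york", "revenue": 23},
]


def bucket_by(docs, key):
    """Group documents into buckets keyed by doc[key], in first-seen order."""
    buckets = defaultdict(list)
    for doc in docs:
        buckets[doc[key]].append(doc)
    return buckets


def step(acc, doc):
    """Reducer: fold one document into the running aggregate of its bucket."""
    return {
        "location": acc["location"] + [doc["location"]],  # collect locations
        "revenue": acc["revenue"] + doc["revenue"],       # sum revenues
    }


def aggregate(docs):
    """Bucket by name, then reduce each bucket to a single result document."""
    results = []
    for name, bucket in bucket_by(docs, "name").items():
        acc = reduce(step, bucket, {"location": [], "revenue": 0})
        results.append({"name": name, **acc})
    return results


for row in aggregate(docs):
    print(row)
```

The reducer and its initial value are the only parts specific to this use case; a different aggregation would swap in a different `step` and a different starting accumulator.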

are there any such reduce-style aggregation operations on the roadmap ?

:c

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/9f361d38-b9de-4a7f-bb4e-244b3be6743b%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(James Cook-2) #2

There seems to be some reluctance by the ES team to provide scriptable aggregators, or perhaps it's on a roadmap and just taking a long time. Kimchi has stated that he would like to identify these use cases and roll them into built-in aggregations so everyone can benefit. I think the range of these use cases is too broad for a specific set of implementations. The aggregation types included are fine for simple stats and bucketing, but there are probably hundreds of scenarios that require custom aggregation.

I'll probably look into creating a custom aggregator for my use case.

Note: a lot of the custom aggregators mentioned in this forum are general implementations of the reduce clause in a map reduce statement.

My use case is the implementation of an Item Response Theory algorithm on a filtered result set. The idea is that a list of student responses to a question can be incrementally processed to result in a proficiency value. The filter (map) results are restricted to a timeframe and sorted, and the aggregation (reduce) step will incrementally inject each student's score into an algorithm that progressively converges on the magnitude of the student's ability to answer those questions.
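
As a rough illustration of that filter-then-fold shape, here is a Python sketch. The update rule is an assumption, not the algorithm described above: it uses a simple Rasch (1PL) style step, where the probability of a correct answer is a logistic function of ability minus item difficulty. The field names (`timestamp`, `score`, `difficulty`) and the `learning_rate` parameter are likewise hypothetical.

```python
import math
from functools import reduce


def update_ability(theta, response, learning_rate=0.5):
    """Fold one response into the running ability estimate.

    Assumed model (Rasch / 1PL): P(correct) = sigmoid(theta - difficulty).
    The estimate moves toward the observed score by a fixed step size.
    """
    p_correct = 1.0 / (1.0 + math.exp(-(theta - response["difficulty"])))
    return theta + learning_rate * (response["score"] - p_correct)


def estimate_ability(responses, start, end, initial=0.0):
    """Filter ("map") to a timeframe, sort by time, then reduce incrementally."""
    selected = sorted(
        (r for r in responses if start <= r["timestamp"] <= end),
        key=lambda r: r["timestamp"],
    )
    return reduce(update_ability, selected, initial)


# Hypothetical responses: score is 1 for correct, 0 for incorrect.
responses = [
    {"timestamp": 1, "score": 1, "difficulty": 0.0},
    {"timestamp": 2, "score": 1, "difficulty": 0.5},
    {"timestamp": 3, "score": 0, "difficulty": 1.0},
    {"timestamp": 9, "score": 0, "difficulty": 0.0},  # outside the window below
]
print(estimate_ability(responses, start=1, end=3))
```

Because each step depends on the previous estimate, the order of the documents matters, which is why the sort happens before the reduce.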



(Matt Weber) #3

You will be able to do this soon. See:

Thanks,
Matt Weber
On Aug 9, 2014 10:44 AM, "James Cook" djamescook@gmail.com wrote:


