Is it possible to get a bucketed aggregation based on the count of values for a field?


(Mike) #1

For example, assume I have the following docs:

{user:"Mike"}
{user:"John"}
{user:"Mike"}
{user:"Sara"}
{user:"Sara"}
{user:"Sara"}

I can do a terms agg on user and get:
Sara: 3
Mike: 2
John: 1

What if I didn't care about the actual count for each value, and instead
just wanted the users bucketed into, say, 2 bins: those that had counts
<= 1, and those that had counts >= 2?
Users Showing up >= 2 Times: 2
Users Showing up < 2 Times: 1

The range agg seems to give me the flexibility that I want in creating
buckets, but that is based on the actual numeric value of a field like
score or price. Is there a way to do the above without iterating through
the thousands of terms myself on the client side?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/4f36c603-eb3b-45a0-ad73-dd6f97a5a0fa%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(ElasticSearch Users mailing list) #2

I don't believe this is possible at the moment. If you can pre-process your
data and index a summary like this into ES:

Sara: 3
Mike: 2
John: 1

Then you can use the range (or filter) agg as you already mentioned.
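
As a rough sketch of that pre-processing idea (not a tested recipe), the per-user counts could be computed up front and indexed as one summary document per user, after which a range agg can bucket on the count. The field name request_count and the aggregation name users_by_count are made up for illustration, and the step that actually indexes the summary documents is left out.

```python
from collections import Counter

# Sample documents from the question.
docs = [
    {"user": "Mike"}, {"user": "John"}, {"user": "Mike"},
    {"user": "Sara"}, {"user": "Sara"}, {"user": "Sara"},
]

# Pre-processing step: one summary document per user.
# "request_count" is a hypothetical field name, not something from the thread.
counts = Counter(d["user"] for d in docs)
summary_docs = [
    {"user": user, "request_count": n} for user, n in counts.most_common()
]

# Request body for a range aggregation over the summary documents.
# In a range aggregation, "from" is inclusive and "to" is exclusive,
# so these two ranges mean "count < 2" and "count >= 2".
range_agg_body = {
    "aggs": {
        "users_by_count": {  # hypothetical aggregation name
            "range": {
                "field": "request_count",
                "ranges": [{"to": 2}, {"from": 2}],
            }
        }
    }
}
```

Run against an index holding only the summary documents, the two buckets would correspond to the "< 2 times" and ">= 2 times" groups from the question.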



(Mike) #3

Thanks. User is just 1 field in my docs, which actually represent all
requests made to my system. What I would really like is to just get a
count of the "heavy", "medium", and "light" users, where heavy would be
users that have made, say, > 10 requests. I guess what I would need is
something like a terms_range agg.
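
To make the desired output concrete, here is a small sketch of the tiering logic applied to per-user request counts that have already been computed (for example, from terms agg buckets or a pre-processed summary). It runs on the client side, so it only illustrates the result such a terms_range agg would return. The function name usage_tiers and the medium threshold are assumptions; only the "> 10 requests" cutoff for heavy comes from the description above.

```python
from collections import Counter


def usage_tiers(request_counts, heavy_over=10, medium_over=3):
    """Count users per tier given a mapping of user -> request count.

    heavy:  more than heavy_over requests (10, as in the post)
    medium: more than medium_over requests (3 is an assumed cutoff)
    light:  everyone else
    """
    tiers = Counter({"heavy": 0, "medium": 0, "light": 0})
    for n in request_counts.values():
        if n > heavy_over:
            tiers["heavy"] += 1
        elif n > medium_over:
            tiers["medium"] += 1
        else:
            tiers["light"] += 1
    return dict(tiers)


# Hypothetical per-user request counts.
example = usage_tiers({"a": 12, "b": 5, "c": 1, "d": 11})
```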

On Fri, May 30, 2014 at 2:09 PM, 'Binh Ly' via elasticsearch <
elasticsearch@googlegroups.com> wrote:

I don't believe this is possible at the moment. If you can pre-process
your data and produce this summarization indexed into ES:

Sara: 3
Mike: 2
John: 1

Then you can use the range (or filter) agg as you already mentioned.


