Aggregation with conditions


(lvalbuena) #1

Hi,

I have 2 cases.

Given the structure
{
email:value,
points:value
}

Case 1:
I have 1000 rows, where multiple rows can have the same value for the
email field.
{"email":"some@email.com","points":5}
{"email":"some@email.com","points":2}
...

How do I tell Elasticsearch to search for all emails that appear only
once in the data set?

Case 2:
Also using aggregation: how can I tell Elasticsearch to get, for each
possible number of occurrences, how many emails appeared that many times
in the data set?
e.g.
emails = 5, occurrences >= 5 // There are 5 emails that appeared 5 times
or more in the dataset
emails = 6, occurrences = 4
emails = 23, occurrences = 3
emails = 2, occurrences = 2
emails = 12, occurrences = 1

Or is it even possible?

Thanks

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/240d054f-131c-4904-81d6-95b7982f2f6c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Binh Ly-2) #2

The first one is not available directly. However, a terms aggregation
sorted by _count ascending will bubble up the least frequent terms
(emails), and you can then filter out the ones you want yourself. The
second one sounds like a simple terms aggregation on the email field
(just make sure the email field is not_analyzed):

{
  "aggs": {
    "group_by_email": {
      "terms": {
        "field": "email"
      }
    }
  }
}
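
To make both cases concrete, here is a rough sketch in Python of the request body with the ascending _count sort, plus the client-side filtering and tallying of the returned buckets. The bucket size of 1000, the helper names emails_seen_once and occurrence_histogram, the cap of 5, and the sample buckets are all assumptions made up for illustration; they are not from the thread. Note also that terms aggregation counts can be approximate on an index with several shards, so treat the result as an estimate there.

```python
from collections import Counter

# Request body for the terms aggregation, ordered so the least frequent
# emails come first. The bucket size of 1000 is an assumption for this sketch.
request_body = {
    "size": 0,  # return no search hits, only the aggregation
    "aggs": {
        "group_by_email": {
            "terms": {
                "field": "email",
                "size": 1000,
                "order": {"_count": "asc"},
            }
        }
    },
}


def emails_seen_once(buckets):
    """Case 1: keep only the emails whose bucket holds exactly one document."""
    return [b["key"] for b in buckets if b["doc_count"] == 1]


def occurrence_histogram(buckets, cap=5):
    """Case 2: count how many emails occurred N times, lumping every
    count of `cap` or more into the `cap` entry."""
    return dict(Counter(min(b["doc_count"], cap) for b in buckets))


# Made-up buckets in the shape returned under
# response["aggregations"]["group_by_email"]["buckets"].
sample_buckets = [
    {"key": "a@example.com", "doc_count": 1},
    {"key": "b@example.com", "doc_count": 1},
    {"key": "c@example.com", "doc_count": 2},
    {"key": "d@example.com", "doc_count": 7},
]

once = emails_seen_once(sample_buckets)
histogram = occurrence_histogram(sample_buckets)
```

With the sample buckets, once holds the two emails that appear a single time, and histogram maps each occurrence count to the number of emails with that count, with the entry for 5 standing for "5 or more".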



(lvalbuena) #3

On Tuesday, April 1, 2014 6:17:30 PM UTC+8, lval...@egg.ph wrote:

Hi,

I have 2 cases.

Given the structure
{
email:value,
points:value
}

Case 1:
I have 1000 rows, where multiple rows can have the same value for the
email field.
{"email":"some@email.com","points":5}
{"email":"some@email.com","points":2}
...

How do I tell Elasticsearch to search for all emails that appear only
once in the data set?

Case 2:
Also using aggregation: how can I tell Elasticsearch to get, for each
possible number of occurrences, how many emails appeared that many times
in the data set?
e.g.
emails = 5, occurrences >= 5 // There are 5 emails that appeared 5 times
or more in the dataset
emails = 6, occurrences = 4
emails = 23, occurrences = 3
emails = 2, occurrences = 2
emails = 12, occurrences = 1

Or is it even possible?

Thanks


