Aggregations and distinct

nicolas_maillard1 · December 3, 2013, 5:18am

Hello everyone

I'm playing around with the aggregations to get a better feel for what I can
do with them.
I was wondering how I would write the following aggregations.
I have entries listing user interactions and the IP at the time of
interaction.
Say I want to count, for every user, the number of different IPs I have seen
for them.
Another question:
I want to find the most seen IPs for every user.

My initial attempt was:

{
  "aggs" : {
    "genders" : {
      "terms" : {
        "field" : "user_id"
      },
      "aggs" : {
        "ips" : { "terms" : { "field" : "remoteip" } }
      }
    }
  }
}

but that does not seem to be quite right

regards

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/e56fdb8d-9173-45cc-8abf-884d20018727%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

nicolas_maillard1 · December 3, 2013, 8:55am

In case this helps anyone, I have found the query.
On a side note, it is relatively slow:
my dataset is about 2.5 million docs on a single node with 15 GB of RAM.
The query:

{
  "aggs": {
    "user": {
      "terms": {
        "field": "user_id"
      },
      "aggs": {
        "popular_ips": {
          "terms": {
            "field": "remoteip"
          }
        }
      }
    }
  }
}
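
For anyone who wants both answers (distinct IP count and most seen IP per user) out of this one query, here is a minimal Python sketch that post-processes the response. It assumes the response layout of released Elasticsearch versions, where `aggregations.user.buckets` holds one entry per user and each entry carries a nested `popular_ips.buckets` list; the layout in 1.0 beta 2 may differ. The function name `summarize_user_ips` and the sample data are hypothetical. Keep in mind that a terms aggregation only returns the top buckets (10 by default), so the distinct count is a lower bound unless `size` is raised on the inner aggregation.

```python
def summarize_user_ips(response):
    """Return {user_id: (distinct_ip_count, most_seen_ip)}.

    Assumes a nested terms-aggregation response named "user" / "popular_ips",
    matching the query above (response shape is an assumption).
    """
    summary = {}
    for user_bucket in response["aggregations"]["user"]["buckets"]:
        ip_buckets = user_bucket["popular_ips"]["buckets"]
        # Pick the most seen IP explicitly instead of relying on bucket order.
        top = max(ip_buckets, key=lambda b: b["doc_count"]) if ip_buckets else None
        # Number of returned IP buckets = distinct IPs seen (capped by "size").
        summary[user_bucket["key"]] = (len(ip_buckets), top["key"] if top else None)
    return summary


# Made-up sample response, for illustration only.
sample_response = {
    "aggregations": {
        "user": {
            "buckets": [
                {
                    "key": "alice",
                    "doc_count": 4,
                    "popular_ips": {
                        "buckets": [
                            {"key": "10.0.0.1", "doc_count": 3},
                            {"key": "10.0.0.2", "doc_count": 1},
                        ]
                    },
                },
                {
                    "key": "bob",
                    "doc_count": 2,
                    "popular_ips": {
                        "buckets": [{"key": "10.0.0.9", "doc_count": 2}]
                    },
                },
            ]
        }
    }
}

summary = summarize_user_ips(sample_response)
print(summary)
```

Counting buckets on the client is only practical when each user has a modest number of IPs; for high-cardinality fields the inner `size` would have to grow accordingly, which adds to the memory pressure.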

jpountz · December 3, 2013, 9:34am

Hi Nicolas,

The aggregations framework present in Elasticsearch 1.0 beta 2 is still at
an early stage and doesn't have all the optimizations that facets have got
over their years of existence. For example, if you compare terms facets
against terms aggregations on string terms, you may notice that terms
aggregations are significantly slower. The reason is that aggregations
don't know yet how to leverage terms ordinals in order to speed up the
generation of the buckets: this is something that will be addressed in the
1.0 release. Other similar improvements are planned for the coming
weeks, and performance numbers should hopefully get better in the next
releases.
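
To illustrate the general idea behind ordinals, here is a small Python sketch. It is not Elasticsearch's implementation, only an assumption-laden toy: each distinct term gets an integer ordinal, documents are bucketed by incrementing an array slot per ordinal, and strings are resolved only once at the end. The function names and sample IPs are hypothetical.

```python
from collections import Counter


def count_by_string(doc_terms):
    """Baseline: bucket documents by hashing every string term."""
    return dict(Counter(doc_terms))


def count_by_ordinal(doc_terms):
    """Bucket documents by integer ordinal, resolving strings at the end."""
    # Term dictionary: sorted unique terms, ordinal = position in the list.
    terms = sorted(set(doc_terms))
    ordinal_of = {term: i for i, term in enumerate(terms)}
    # Per-document ordinals. Assumption: a real engine would have these
    # available up front rather than building them per request.
    doc_ordinals = [ordinal_of[t] for t in doc_terms]
    # Counting is now a plain array increment, with no string hashing.
    counts = [0] * len(terms)
    for ordinal in doc_ordinals:
        counts[ordinal] += 1
    # Resolve ordinals back to terms once per bucket, not once per document.
    return {terms[i]: c for i, c in enumerate(counts) if c}


remote_ips = ["10.0.0.1", "10.0.0.2", "10.0.0.1", "10.0.0.3", "10.0.0.1"]
by_string = count_by_string(remote_ips)
by_ordinal = count_by_ordinal(remote_ips)
print(by_ordinal)
```

Both functions produce the same buckets; the difference is that the ordinal version touches strings once per distinct term rather than once per document.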

--
Adrien Grand

nicolas_maillard1 · December 3, 2013, 9:43am

Thanks for the heads up, Adrien.

Definitely looking forward to this release. I'm testing out the usability on
some of our use cases, and right now it is a little slow and very often
hits the RAM limit, even for this small table and a somewhat simple query.
Nonetheless, it is a great feature and I am sure it will be even better by
the time it hits GA.
Thanks, ES, for all the hard work.
