Need help with aggregation and unique counted values


(Andreas Hembach) #1

Hi all,

I'm new here and have a problem with a query. I hope someone can help me.

My Problem:

  • I have a log with user clicks, the user revenue and their session IDs.
    Now I want to build a date histogram with the total click count, the
    number of unique session IDs and the user revenue.

My Query:
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "log_over_time": {
      "date_histogram": {
        "field": "dateline",
        "interval": "month",
        "format": "yyyy-MM"
      },
      "aggs": {
        "amount": {
          "sum": {
            "field": "order_amount"
          }
        },
        "unique": {
          "terms": {
            "field": "user_session_id",
            "size": 100000
          }
        }
      }
    }
  }
}

My first approach is to count the entries returned by the "unique" terms
aggregation, but the response is very large and limited to 100000 entries.

Is there a better way to do this? Can I do something like a group by value?

A big thank you for the help!

Greetings,
Andreas

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/e4172503-ad5b-4bb0-9856-2fd3abb647b1%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Adrien Grand) #2

Hi,

The next version of Elasticsearch will have a new cardinality[1]
aggregation that allows for computing unique counts. You could use it in
lieu of the "unique" terms aggregation in order to compute the unique count
of session IDs.

[1] http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-metrics-cardinality-aggregation.html
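
For illustration, the query from the first post could be adapted roughly as follows once the cardinality aggregation is available. This is only a sketch: the field names are taken from the original query, the aggregation name "unique_sessions" is made up for this example, and the cardinality aggregation returns an approximate count rather than an exact one.

```json
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "log_over_time": {
      "date_histogram": {
        "field": "dateline",
        "interval": "month",
        "format": "yyyy-MM"
      },
      "aggs": {
        "amount": {
          "sum": {
            "field": "order_amount"
          }
        },
        "unique_sessions": {
          "cardinality": {
            "field": "user_session_id"
          }
        }
      }
    }
  }
}
```

Each monthly bucket should then contain a single number for the unique sessions instead of a list of up to 100000 terms. The click count per month should not need an aggregation of its own, since every date_histogram bucket already reports a doc_count.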

On Tue, Mar 18, 2014 at 2:42 PM, Andreas Hembach hembach3@gmail.com wrote:

--
Adrien Grand


