Aggregating by hour


(Jenny Blunt) #1

I was sort of expecting the following to give me an aggregation which
groups the results only by hour:

curl http://localhost:9000/stream/_search -d '{
"aggs" : {
"visitor_count" : { "date_histogram" : { "field" : "created_at", "interval" : "hour"} }
}
}'

As it stands, it does group by hour, but it's also grouped by day. (I end
up with 24 results for each day I have data).

I understand this is correct; however, I would like to understand how it is
possible to group only by hour of day, so that I get just 24 results in total?
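
To make the difference concrete, here is a small Python sketch (plain Python, not Elasticsearch code). It buckets a few made-up timestamps the way an hourly date_histogram does, with one bucket per calendar hour, and then by hour of day alone, which gives at most 24 buckets. The sample timestamps and both function names are hypothetical.

```python
from collections import Counter
from datetime import datetime

# Made-up sample data: three visits spread across two days.
timestamps = [
    datetime(2014, 7, 7, 9, 15),
    datetime(2014, 7, 7, 9, 45),
    datetime(2014, 7, 8, 9, 5),
]

def by_calendar_hour(stamps):
    """One bucket per distinct (date, hour), roughly like an hourly date_histogram."""
    return Counter(ts.replace(minute=0, second=0, microsecond=0) for ts in stamps)

def by_hour_of_day(stamps):
    """One bucket per hour of day (0-23), regardless of the date."""
    return Counter(ts.hour for ts in stamps)

calendar_buckets = by_calendar_hour(timestamps)
hour_buckets = by_hour_of_day(timestamps)
print(len(calendar_buckets))  # 2 (09:00 on each of the two days)
print(hour_buckets[9])        # 3 (all three visits fall in hour 9)
```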

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/cd84375c-2fbc-48c3-b1cd-79f04e89d6a2%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Gabe Gorelick-Feldman) #2

I think you want something like a histogram with a value script to decide
the bucket. But it looks like histogram doesn't support that, so would a
range agg work? Otherwise, it might be easiest to store the hour in
addition to the timestamp.
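
Storing the hour alongside the timestamp could look roughly like this at indexing time. This is only a sketch in Python under assumptions: the field name `hour` and the helper `with_hour` are made up for illustration, and `created_at` is assumed to be an ISO 8601 string without a UTC offset.

```python
from datetime import datetime

def with_hour(doc, field="created_at", hour_field="hour"):
    """Return a copy of doc with a hypothetical hour-of-day field added."""
    ts = datetime.strptime(doc[field], "%Y-%m-%dT%H:%M:%S")
    enriched = dict(doc)           # leave the caller's document untouched
    enriched[hour_field] = ts.hour  # integer in the range 0-23
    return enriched

doc = with_hour({"created_at": "2014-07-08T04:06:02"})
print(doc["hour"])  # 4
```

The extra field costs a little storage per document, but it turns the question into a plain grouping on an integer field with no script involved.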




(Antonio Augusto Santos) #3

You can use the histogram aggregation with a script, something like document[@timestamp].hour



(Jenny Blunt) #4

Sort of trying to stay away from scripting after we ran out of juice
recently. Seems to take a reasonably large amount of memory for each run?

We're using it for about 100 million records.

In the end, I added an hour and day field to Mongo when processing the raw
data. That way we can use a really simple terms aggregation with a filter.
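
For anyone wanting to try the same approach, a request body for that kind of query might be built as below. This is a sketch, not the exact query used here: the field names `hour` and `day`, the aggregation names, and the helper `hourly_counts_body` are all assumptions. It wraps a terms aggregation in a filter aggregation when a day is given.

```python
import json

def hourly_counts_body(day=None, hour_field="hour", day_field="day"):
    """Build a search body that counts documents per stored hour value.

    hour_field and day_field are assumed names for the precomputed fields.
    """
    # Ask for 24 buckets explicitly; the terms aggregation returns 10 by default.
    by_hour = {"terms": {"field": hour_field, "size": 24}}
    if day is None:
        aggs = {"by_hour": by_hour}
    else:
        # Restrict the counts to a single day with a filter aggregation.
        aggs = {
            "selected_day": {
                "filter": {"term": {day_field: day}},
                "aggs": {"by_hour": by_hour},
            }
        }
    return {"size": 0, "aggs": aggs}  # size 0: return buckets only, no hits

body = hourly_counts_body(day="2014-07-08")
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the -d payload to the same _search endpoint used in the original curl command.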

I'll have a look at the 'document[@timestamp].hour' idea though, and see
what it's like.

Cheers!



