Calculate avg on serial data


(Yong Wang) #1

Hi!
I am using Elasticsearch to store a series of values of a metric, like
{timestamp, value}. Now I am trying to calculate the average of this
metric for each minute in some time range. I did this by using a filter to get
all docs in a given minute, calculating the avg, and then moving on to the next
minute. So I have to run a search for every minute, and the performance was
terrible. Is there any suggestion for a better algorithm with better
performance? Thank you for any comments!

Alan


(Marcin Dojwa) #2

Hi,

Check date_histogram facet described here:
http://www.elasticsearch.org/guide/reference/api/search/facets/date-histogram-facet.html

Just give "interval":"minute"

Then you have to work out how to return the avg rather than the count for each
minute, but I guess this can be done using "Script Value Field" (also described
there).
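
A rough sketch of what such a request body could look like, built as a plain Python dict so it can be inspected without a running cluster. The facet name `per_minute` and the field name `timestamp` are assumptions for illustration; only `date_histogram`, `field` and `interval` come from the facet documentation linked above. On its own this facet returns a count per minute, not an average.

```python
import json


def minute_histogram_request(time_field="timestamp"):
    """Build a search body with a date_histogram facet bucketed per minute.

    `time_field` is a hypothetical field name; use whatever your documents
    actually store the timestamp under.
    """
    return {
        "size": 0,  # only the facet is of interest, so skip returning hits
        "query": {"match_all": {}},
        "facets": {
            "per_minute": {  # arbitrary facet name chosen for this sketch
                "date_histogram": {
                    "field": time_field,
                    "interval": "minute",
                }
            }
        },
    }


if __name__ == "__main__":
    # Print the JSON that would be sent as the body of a _search request.
    print(json.dumps(minute_histogram_request(), indent=2))
```

The point is that one search replaces the one-search-per-minute loop: the bucketing happens on the server side.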

Best regards.



(Marcin Dojwa) #3

It's funny, I tried to help you and now I have the same problem :)
Does anyone know how to get a date histogram with the average field value
within each date entry? The standard date histogram only returns a count, and I
do not know how to get sum, avg, etc. If I had at least the sum, I could
compute the avg myself.

Thanks for help.

Best regards.
Marcin



(Marcin Dojwa) #4

I found the answer, Alan: you have to use the Date Histogram facet with
key_field pointing to timestamp and value_field pointing to value. Then you
will get the mean (average) of value within each timestamp entry.
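
To make that concrete, here is a minimal sketch in Python that builds such a request and pulls the per-minute means out of a response. The facet name `per_minute`, the helper names, and the `range` query bounds are assumptions for illustration. The sample response is hand-written to match the entry shape I understand the facet to return (`time`, `count`, `total`, `mean`); it is not output from a real cluster.

```python
def minute_average_request(start, end, key_field="timestamp", value_field="value"):
    """Build a search body asking for per-minute statistics of `value_field`.

    `start` and `end` bound the time range (assumed here to be inclusive
    start, exclusive end); both helper and facet names are hypothetical.
    """
    return {
        "size": 0,  # only the facet matters, not the hits
        "query": {"range": {key_field: {"gte": start, "lt": end}}},
        "facets": {
            "per_minute": {
                "date_histogram": {
                    "key_field": key_field,      # field used for bucketing
                    "value_field": value_field,  # field the stats are computed on
                    "interval": "minute",
                }
            }
        },
    }


def minute_means(response, facet_name="per_minute"):
    """Map each bucket start time (epoch milliseconds) to its mean value."""
    entries = response["facets"][facet_name]["entries"]
    return {entry["time"]: entry["mean"] for entry in entries}


# Hand-written response in the assumed shape, for illustration only.
sample_response = {
    "facets": {
        "per_minute": {
            "entries": [
                {"time": 1333443600000, "count": 3, "total": 30.0, "mean": 10.0},
                {"time": 1333443660000, "count": 2, "total": 5.0, "mean": 2.5},
            ]
        }
    }
}

if __name__ == "__main__":
    print(minute_means(sample_response))
```

If the response also carries `total` and `count` per entry, as assumed above, the average can be recomputed on the client as `total / count` when needed.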

Best regards.



(Marcin Dojwa) #5

I've just found out that the field used as 'value_field' must be mapped with
"index":"yes". Am I correct?



(Shay Banon) #6

Yes, the value field needs to be indexed as well.
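
As a small illustration of that requirement, the sketch below builds a mapping as a Python dict and checks that no field has indexing switched off. The type name `metric`, the field types, and the helper names are assumptions, and so is the reading that a field counts as indexed unless its mapping says `"index": "no"`; check the mapping documentation for your Elasticsearch version.

```python
def metric_mapping(type_name="metric"):
    """Build a hypothetical mapping for {timestamp, value} documents."""
    return {
        type_name: {
            "properties": {
                "timestamp": {"type": "date"},
                # "index" is left unset here, on the assumption that the
                # default keeps the field indexed and usable as value_field.
                "value": {"type": "double"},
            }
        }
    }


def unindexed_fields(mapping, type_name="metric"):
    """Return the names of fields explicitly mapped with "index": "no"."""
    properties = mapping[type_name]["properties"]
    return [name for name, spec in properties.items() if spec.get("index") == "no"]


if __name__ == "__main__":
    # An empty list means neither field has indexing disabled.
    print(unindexed_fields(metric_mapping()))
```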


