Stats aggregation, return documents at min/max


(Dan Isla) #1

I'm trying to combine a date_histogram agg with a stats agg to find the
min/max within a bucket of documents. So far so good, but now I need a
field from the document where the min/max was actually found.

My index basically has millions of x,y datapoints that I want to search for
and plot.

What I want are the 2 documents from each bucket where the 'y' value was a
min and a max.

This query successfully gave me buckets divided by 300s intervals and the
min/max y values within those buckets. Problem is, I have no way of linking
those min/max values to their corresponding 'x' value for plotting.

{
  "size": 0,
  "aggs": {
    "vals": {
      "filter": { "term": { "component": "data_to_plot" } },
      "aggs": {
        "values_over_time": {
          "date_histogram": {
            "field": "x",
            "interval": "300s"
          },
          "aggs": {
            "stats_y": { "stats": { "field": "y" } }
          }
        }
      }
    }
  }
}
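
To make the goal concrete, here is a small in-memory sketch of the result being asked for: per 300s bucket, the whole (x, y) point at the minimum and at the maximum y, not just the y values. It is plain Python with no Elasticsearch involved; the function name `min_max_points` and the sample data are hypothetical and only illustrate the requirement.

```python
from collections import defaultdict


def min_max_points(points, interval=300):
    """Group (x, y) points into fixed-width x buckets and return, per bucket,
    the full point with the smallest y and the full point with the largest y."""
    buckets = defaultdict(list)
    for x, y in points:
        # The bucket key is the start of the interval that contains x.
        buckets[(x // interval) * interval].append((x, y))

    result = {}
    for key in sorted(buckets):
        pts = buckets[key]
        result[key] = {
            "min": min(pts, key=lambda p: p[1]),  # point with the lowest y
            "max": max(pts, key=lambda p: p[1]),  # point with the highest y
        }
    return result


# Hypothetical sample data: x in seconds, y is the value to plot.
sample = [(0, 5.0), (120, 1.5), (299, 9.0), (300, 4.0), (450, 7.5)]
extremes = min_max_points(sample)
```

The stats agg above returns only the y side of each pair; the x side is what is missing.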

Any suggestions?

Thanks!

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/bfa7f7d9-6d8c-493d-bcea-5acae891ab07%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Dan Isla) #2

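I was able to solve my own problem using a terms sub-aggregation on the 'x'
values (the time_seconds field in the query below) within each interval
bucket. Each terms agg is ordered by a min or max sub-aggregation on 'y'
and limited to size 1, so it returns only the 'x' value where that
extreme occurs.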

{
  "size": 0,
  "aggs": {
    "vals": {
      "filter": { "term": { "component": "data_to_plot" } },
      "aggs": {
        "values_over_time": {
          "date_histogram": {
            "field": "time_seconds",
            "interval": "1500s"
          },
          "aggs": {
            "time_y_min": {
              "terms": {
                "field": "time_seconds",
                "order": { "y_min": "asc" },
                "size": 1
              },
              "aggs": {
                "y_min": { "min": { "field": "y" } }
              }
            },
            "time_y_max": {
              "terms": {
                "field": "time_seconds",
                "order": { "y_max": "desc" },
                "size": 1
              },
              "aggs": {
                "y_max": { "max": { "field": "y" } }
              }
            }
          }
        }
      }
    }
  }
}
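
For plotting, the response to the query above still has to be flattened into (x, y) pairs. The following is a minimal sketch of that step, assuming the usual aggregations response layout (a `buckets` list under each bucketing agg and a `value` under each metric agg). The function name `extract_extremes` and the sample response are hypothetical; the sample is hand-written for illustration and is not real output.

```python
def extract_extremes(response):
    """Return one dict per date_histogram bucket holding the (x, y) pair
    at the minimum y and the (x, y) pair at the maximum y."""
    rows = []
    buckets = response["aggregations"]["vals"]["values_over_time"]["buckets"]
    for bucket in buckets:
        row = {"bucket": bucket["key"]}
        for terms_name, metric in (("time_y_min", "y_min"), ("time_y_max", "y_max")):
            top = bucket[terms_name]["buckets"]
            if top:
                # size is 1, so the first terms bucket is the extreme point.
                row[metric] = (top[0]["key"], top[0][metric]["value"])
            else:
                # An interval without documents has no terms buckets.
                row[metric] = None
        rows.append(row)
    return rows


# Hypothetical, hand-written response fragment (assumed shape, not real output).
sample_response = {
    "aggregations": {
        "vals": {
            "doc_count": 3,
            "values_over_time": {
                "buckets": [
                    {
                        "key": 0,
                        "doc_count": 3,
                        "time_y_min": {"buckets": [
                            {"key": 120, "doc_count": 1, "y_min": {"value": 1.5}}
                        ]},
                        "time_y_max": {"buckets": [
                            {"key": 299, "doc_count": 1, "y_max": {"value": 9.0}}
                        ]},
                    }
                ]
            },
        }
    }
}

rows = extract_extremes(sample_response)
```

Each row then carries both extremes for one interval, ready to hand to a plotting library.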

On Thursday, June 12, 2014 4:08:13 PM UTC-7, Dan Isla wrote:

[...]


(system) #3