Bucket Selector Aggregation on Date Histogram _key

magrossi · April 3, 2017, 10:31am

Hi,
I'm trying to use the bucket selector aggregation to filter unwanted buckets out of my response. I'm not interested in those buckets themselves, but they are needed for a calculation (a moving average). My main aggregation is a date_histogram, so the _key is a date. I have tried the following without success:

    "_m_pipeline": {
      "date_histogram": {
        "field": "date",
        "interval": "month"
      },
      "aggs": {
        "_m_sum": {
          "sum": {
            "field": "value"
          }
        },
        ... <other aggregations here> ...
        "_m_data_bucket_filter": {
          "bucket_selector": {
            "buckets_path": {
              "bucket_key": "_key"
            },
            "script": "params.bucket_key < 1491004800000L" // This number is equivalent to 2017-04-01 00:00:00 UTC in epoch milliseconds
          }
        }
      }
    }

The result I get is:

{
  "error": {
    "root_cause": [],
    "type": "reduce_search_phase_exception",
    "reason": "[reduce] ",
    "phase": "fetch",
    "grouped": true,
    "failed_shards": [],
    "caused_by": {
      "type": "aggregation_execution_exception",
      "reason": "buckets_path must reference either a number value or a single value numeric metric aggregation, got: org.joda.time.DateTime"
    }
  },
  "status": 503
}

The main idea of my query is to get a monthly moving average with a window of 12. To calculate it, I need at least 12 months of data in my histogram, but I am only interested in the latest value of the moving average, so I would like to discard the excess data on the server side before any post-processing.
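
For illustration, the calculation described above can be sketched in Python. This is a minimal sketch, not the Elasticsearch implementation: it uses a plain trailing mean that includes the current month, which may not match the exact semantics of the moving average pipeline aggregation, and `monthly_sums` is made-up sample data standing in for the per-month `_m_sum` values.

```python
from typing import List, Optional


def trailing_moving_average(values: List[float], window: int = 12) -> List[Optional[float]]:
    """Trailing mean over `window` values; None until enough values exist."""
    averages: List[Optional[float]] = []
    for i in range(len(values)):
        if i + 1 < window:
            # Fewer than `window` months seen so far: no average yet.
            averages.append(None)
        else:
            chunk = values[i + 1 - window : i + 1]
            averages.append(sum(chunk) / window)
    return averages


# Hypothetical sample: 13 months of monthly sums (1.0, 2.0, ..., 13.0).
monthly_sums = [float(m) for m in range(1, 14)]

averages = trailing_moving_average(monthly_sums, window=12)

# Only the most recent value is of interest; the earlier months exist
# solely to fill the window.
latest = averages[-1]
print(latest)
```

The earlier buckets only feed the window, which is why dropping them from the response after the average has been computed loses nothing.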

Is this possible, and if so, how?

Thank you!

colings86 · April 3, 2017, 10:52am

Could you have a look for a stack trace in the Elasticsearch server logs and paste it here?

magrossi · April 3, 2017, 11:00am

Hi,
Here you go.

[2017-04-03T10:58:32,634][DEBUG][o.e.a.s.TransportSearchAction] [fNx6XyA] failed to reduce search
org.elasticsearch.action.search.ReduceSearchPhaseException: [reduce]
at org.elasticsearch.action.search.SearchQueryThenFetchAsyncAction$2.onFailure(SearchQueryThenFetchAsyncAction.java:151) [elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.onFailure(ThreadContext.java:512) [elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:39) [elasticsearch-5.1.2.jar:5.1.2]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_121]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_121]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]
Caused by: org.elasticsearch.search.aggregations.AggregationExecutionException: buckets_path must reference either a number value or a single value numeric metric aggregation, got: org.joda.time.DateTime
at org.elasticsearch.search.aggregations.pipeline.BucketHelpers.resolveBucketValue(BucketHelpers.java:171) ~[elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.search.aggregations.pipeline.BucketHelpers.resolveBucketValue(BucketHelpers.java:152) ~[elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.search.aggregations.pipeline.bucketselector.BucketSelectorPipelineAggregator.reduce(BucketSelectorPipelineAggregator.java:100) ~[elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.search.aggregations.InternalAggregation.reduce(InternalAggregation.java:136) ~[elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.search.aggregations.InternalAggregations.reduce(InternalAggregations.java:158) ~[elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.search.aggregations.bucket.terms.InternalTerms$Bucket.reduce(InternalTerms.java:135) ~[elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.search.aggregations.bucket.terms.InternalTerms.doReduce(InternalTerms.java:234) ~[elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.search.aggregations.InternalAggregation.reduce(InternalAggregation.java:134) ~[elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.search.aggregations.InternalAggregations.reduce(InternalAggregations.java:158) ~[elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.action.search.SearchPhaseController.merge(SearchPhaseController.java:489) ~[elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.action.search.SearchQueryThenFetchAsyncAction$2.doRun(SearchQueryThenFetchAsyncAction.java:140) ~[elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:527) ~[elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-5.1.2.jar:5.1.2]
... 3 more
[2017-04-03T10:58:32,635][WARN ][r.suppressed ] path: /test2/full/_search, params: {index=test2, type=full}
org.elasticsearch.action.search.ReduceSearchPhaseException: [reduce]
at org.elasticsearch.action.search.SearchQueryThenFetchAsyncAction$2.onFailure(SearchQueryThenFetchAsyncAction.java:151) [elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.onFailure(ThreadContext.java:512) [elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:39) [elasticsearch-5.1.2.jar:5.1.2]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_121]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_121]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]
Caused by: org.elasticsearch.search.aggregations.AggregationExecutionException: buckets_path must reference either a number value or a single value numeric metric aggregation, got: org.joda.time.DateTime
at org.elasticsearch.search.aggregations.pipeline.BucketHelpers.resolveBucketValue(BucketHelpers.java:171) ~[elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.search.aggregations.pipeline.BucketHelpers.resolveBucketValue(BucketHelpers.java:152) ~[elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.search.aggregations.pipeline.bucketselector.BucketSelectorPipelineAggregator.reduce(BucketSelectorPipelineAggregator.java:100) ~[elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.search.aggregations.InternalAggregation.reduce(InternalAggregation.java:136) ~[elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.search.aggregations.InternalAggregations.reduce(InternalAggregations.java:158) ~[elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.search.aggregations.bucket.terms.InternalTerms$Bucket.reduce(InternalTerms.java:135) ~[elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.search.aggregations.bucket.terms.InternalTerms.doReduce(InternalTerms.java:234) ~[elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.search.aggregations.InternalAggregation.reduce(InternalAggregation.java:134) ~[elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.search.aggregations.InternalAggregations.reduce(InternalAggregations.java:158) ~[elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.action.search.SearchPhaseController.merge(SearchPhaseController.java:489) ~[elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.action.search.SearchQueryThenFetchAsyncAction$2.doRun(SearchQueryThenFetchAsyncAction.java:140) ~[elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:527) ~[elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-5.1.2.jar:5.1.2]
... 3 more

Regards.

colings86 · April 3, 2017, 2:09pm

Unfortunately, this seems to be a bug. I have opened https://github.com/elastic/elasticsearch/issues/23874 to get the problem fixed. At the moment we require that all bucket paths resolve to a numeric value (that is, an instance of java.lang.Number). This is important for most of the pipeline aggregations since, by design, they work only on numeric values. However, in the case of bucket_script and bucket_selector we should allow other types as long as the script can support them (essentially, this means allowing DateTime and String values as well).

Unfortunately I can't think of a way to work around this at the moment, sorry.

magrossi · April 3, 2017, 2:24pm

Thanks for the quick feedback. I'll follow that issue closely, but I guess I will have to deal with this on the client side for now.
Regards.
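
A client-side filter along these lines might look like the following Python sketch. It assumes the date_histogram buckets have already been pulled out of the search response as a list of dicts whose `key` field holds epoch milliseconds; `sample_buckets`, `epoch_millis`, and `buckets_before` are hypothetical names used only for this example.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List


def epoch_millis(dt: datetime) -> int:
    """Convert a timezone-aware datetime to epoch milliseconds."""
    return int(dt.timestamp() * 1000)


def buckets_before(buckets: List[Dict[str, Any]], cutoff_ms: int) -> List[Dict[str, Any]]:
    """Keep buckets whose key is strictly before the cutoff.

    Mirrors the predicate from the bucket_selector script in the question:
    params.bucket_key < cutoff.
    """
    return [bucket for bucket in buckets if bucket["key"] < cutoff_ms]


# 2017-04-01 00:00:00 UTC, the same cutoff as in the original script.
CUTOFF_MS = epoch_millis(datetime(2017, 4, 1, tzinfo=timezone.utc))

# Hypothetical buckets, shaped like entries of a date_histogram response.
sample_buckets = [
    {"key": 1488326400000, "doc_count": 3},  # 2017-03-01 UTC
    {"key": 1491004800000, "doc_count": 5},  # 2017-04-01 UTC
]

kept = buckets_before(sample_buckets, CUTOFF_MS)
print(CUTOFF_MS, [bucket["key"] for bucket in kept])
```

Flipping the comparison in `buckets_before` would keep only the buckets at or after the cutoff instead, if those are the ones wanted.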
