Pipeline aggregation: date histogram + terms


#1

Hi,

I would like to build fixed-size buckets over the dates (date_histogram). Then, within each date bucket, I would like one bucket per term, also built dynamically. The result would look something like this:

[
  {
    "timestamp": "month0",
    "tags": {
       "blue": 41,
       "red": 35,
       "green": 21
    }
  },
  ...
  {
    "timestamp": "monthN",
    "tags": {
       "blue": 41,
       "red": 35,
       "green": 21
    }
  }
]

To achieve that, I wrote the following pipeline aggregation, but it doesn't work:

{
    "aggs" : {
        "sales_per_month" : {
            "date_histogram" : {
                "field" : "date",
                "interval" : "month"
            },
            "aggs": {
                "tags": {
                    "terms": {
                        "field": "tags"
                    }
                }
            }
        },
        "monthly_tags": {
            "bucket": {
                "buckets_path": "sales_per_month>tags" 
            }
        }
    }
}

Any tips?
Thanks


(Daniel Mitterdorfer) #2

Hi @arno-london,

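Doesn't a plain terms sub-aggregation nested inside the date_histogram already do what you want? No pipeline aggregation is needed: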

DELETE /test

PUT /test
{
    "mappings": {
        "type": {
            "properties": {
                "timestamp": {
                    "type": "date"
                },
                "tags": {
                    "type": "keyword"
                }
            }
        }
    }
}

POST /test/type/_bulk
{ "index": {"_index": "test", "_type": "type"} }
{"timestamp": "2016-01-01", "tags": ["red", "green"]}
{ "index": {"_index": "test", "_type": "type"} }
{"timestamp": "2016-01-01", "tags": ["blue"]}
{ "index": {"_index": "test", "_type": "type"} }
{"timestamp": "2016-01-01", "tags": ["blue"]}
{ "index": {"_index": "test", "_type": "type"} }
{"timestamp": "2016-01-01", "tags": ["blue"]}
{ "index": {"_index": "test", "_type": "type"} }
{"timestamp": "2016-01-01", "tags": ["green"]}
{ "index": {"_index": "test", "_type": "type"} }
{"timestamp": "2016-02-01", "tags": ["green"]}
{ "index": {"_index": "test", "_type": "type"} }
{"timestamp": "2016-02-01", "tags": ["green"]}
{ "index": {"_index": "test", "_type": "type"} }
{"timestamp": "2016-02-01", "tags": ["green"]}
{ "index": {"_index": "test", "_type": "type"} }
{"timestamp": "2016-02-01", "tags": ["green"]}
{ "index": {"_index": "test", "_type": "type"} }
{"timestamp": "2016-02-01", "tags": ["red"]}
{ "index": {"_index": "test", "_type": "type"} }
{"timestamp": "2016-03-01", "tags": ["blue"]}

GET /test/type/_search
{
    "query": {
        "match_all": {}
    }, 
    "size": 0, 
    "aggs" : {
        "sales_per_month" : {
            "date_histogram" : {
                "field" : "timestamp",
                "interval" : "month"
            },
            "aggs": {
                "tags": {
                    "terms": {
                        "field": "tags"
                    }
                }
            }
        }
    }
}

This produces:

{
   "took": 2,
   "timed_out": false,
   "_shards": {
      "total": 5,
      "successful": 5,
      "failed": 0
   },
   "hits": {
      "total": 11,
      "max_score": 0,
      "hits": []
   },
   "aggregations": {
      "sales_per_month": {
         "buckets": [
            {
               "key_as_string": "2016-01-01T00:00:00.000Z",
               "key": 1451606400000,
               "doc_count": 5,
               "tags": {
                  "doc_count_error_upper_bound": 0,
                  "sum_other_doc_count": 0,
                  "buckets": [
                     {
                        "key": "blue",
                        "doc_count": 3
                     },
                     {
                        "key": "green",
                        "doc_count": 2
                     },
                     {
                        "key": "red",
                        "doc_count": 1
                     }
                  ]
               }
            },
            {
               "key_as_string": "2016-02-01T00:00:00.000Z",
               "key": 1454284800000,
               "doc_count": 5,
               "tags": {
                  "doc_count_error_upper_bound": 0,
                  "sum_other_doc_count": 0,
                  "buckets": [
                     {
                        "key": "green",
                        "doc_count": 4
                     },
                     {
                        "key": "red",
                        "doc_count": 1
                     }
                  ]
               }
            },
            {
               "key_as_string": "2016-03-01T00:00:00.000Z",
               "key": 1456790400000,
               "doc_count": 1,
               "tags": {
                  "doc_count_error_upper_bound": 0,
                  "sum_other_doc_count": 0,
                  "buckets": [
                     {
                        "key": "blue",
                        "doc_count": 1
                     }
                  ]
               }
            }
         ]
      }
   }
}
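
If you need exactly the flat shape from your first post, the response above can be reshaped on the client side. Here is a minimal Python sketch, assuming the response has already been parsed into a dict. The helper name flatten_histogram is made up for illustration and is not part of any Elasticsearch client, and the sample response is a trimmed copy of the output above.

```python
def flatten_histogram(response, histogram="sales_per_month", terms="tags"):
    """Turn nested date_histogram > terms buckets into one row per month."""
    rows = []
    for month in response["aggregations"][histogram]["buckets"]:
        rows.append({
            "timestamp": month["key_as_string"],
            # Map each term bucket to term -> document count.
            "tags": {b["key"]: b["doc_count"] for b in month[terms]["buckets"]},
        })
    return rows


# Trimmed copy of the response shown above (only the fields the helper reads).
response = {
    "aggregations": {
        "sales_per_month": {
            "buckets": [
                {
                    "key_as_string": "2016-01-01T00:00:00.000Z",
                    "tags": {"buckets": [
                        {"key": "blue", "doc_count": 3},
                        {"key": "green", "doc_count": 2},
                        {"key": "red", "doc_count": 1},
                    ]},
                },
                {
                    "key_as_string": "2016-02-01T00:00:00.000Z",
                    "tags": {"buckets": [
                        {"key": "green", "doc_count": 4},
                        {"key": "red", "doc_count": 1},
                    ]},
                },
                {
                    "key_as_string": "2016-03-01T00:00:00.000Z",
                    "tags": {"buckets": [
                        {"key": "blue", "doc_count": 1},
                    ]},
                },
            ]
        }
    }
}

rows = flatten_histogram(response)
for row in rows:
    print(row)
```

Each row then has a "timestamp" and a "tags" object keyed by term, which matches the structure you sketched.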

Daniel


#3

Yes! Many thanks!

There is clearly something I didn't understand about pipeline aggregations, so I will have to read the documentation again.
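
For later readers with the same confusion: a pipeline aggregation does not create nested buckets. It computes a value from the output of other aggregations, and its buckets_path has to resolve to a numeric metric or to the special _count path. There is also no aggregation type called "bucket". The Python sketch below builds a request body that keeps the nested terms sub-aggregation and adds one real sibling pipeline aggregation (max_bucket) next to it. The names busiest_month and monthly_tags_request are made up for illustration, and the body has not been run against a cluster here.

```python
import json


def monthly_tags_request(date_field="timestamp", tag_field="tags"):
    """Build a search body: terms nested in a date_histogram, plus a sibling pipeline agg."""
    return {
        "size": 0,
        "aggs": {
            "sales_per_month": {
                # "interval" as used in this thread; recent Elasticsearch
                # versions use "calendar_interval" instead.
                "date_histogram": {"field": date_field, "interval": "month"},
                # Nesting is done with a sub-aggregation, not a pipeline aggregation.
                "aggs": {"tags": {"terms": {"field": tag_field}}},
            },
            # Sibling pipeline aggregation: reads the per-month document counts
            # and reports the month with the highest one.
            "busiest_month": {
                "max_bucket": {"buckets_path": "sales_per_month>_count"}
            },
        },
    }


body = monthly_tags_request()
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of a GET /test/type/_search request like the one above.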


(Daniel Mitterdorfer) #4

Hi @arno-london,

Great that I could help you! :)

Daniel


#5

A big help!
Have a great day Daniel!

