Top trend/variation in terms aggregation

VanMicky · March 30, 2018, 3:03pm

Hello,

Let's say I have a date histogram with a nested terms aggregation:

{
  "size": 0,
  "aggs": {
    "my_date_histo": {
      "date_histogram": {
        "field": "date",
        "interval": "month"
      },
      "aggs": {
        "categories": {
          "terms": {
            "field": "category"
          }
        }
      }
    }
  }
}

Which returns something like:

"aggregations": {
    "my_date_histo": {
      "buckets": [
        {
          "key_as_string": "2018-02-01T00:00:00.000Z",
          "categories": {
            "buckets": [
              {
                "key": "A",
                "doc_count": 10
              },
              {
                "key": "B",
                "doc_count": 5
              },
              {
                "key": "C",
                "doc_count": 0
              }
            ]
          }
        },
        {
          "key_as_string": "2018-03-01T00:00:00.000Z",
          "categories": {
            "buckets": [
              {
                "key": "A",
                "doc_count": 10
              },
              {
                "key": "B",
                "doc_count": 4
              },
              {
                "key": "C",
                "doc_count": 3
              }
            ]
          }
        }
      ]
    }
}

I would like to know the top term variations (positive and negative) between the two dates, something like:

"categories": {
  "buckets": [
    {
      "key": "C",
      "doc_count": 3
    },
    {
      "key": "A",
      "doc_count": 0
    },
    {
      "key": "B",
      "doc_count": -1
    }
  ]
}

Any idea how to do such an aggregation?

dadoonet · March 30, 2018, 3:17pm

Maybe pipeline aggregations can do that.

VanMicky · March 30, 2018, 3:44pm

I actually found a way. It's not very pretty, but the following returns the top positive trend:

{
  "size": 0,
  "aggs": {
    "genres": {
      "terms": {
        "field": "category"
      },
      "aggs": {
        "sales_bucket_sort": {
          "bucket_sort": {
            "sort": [
              {"histo>thirtieth_difference": {"order": "desc"}}
            ],
            "size": 10,
            "gap_policy": "insert_zeros"
          }
        },
        "histo": {
          "date_histogram": {
            "field": "date",
            "interval": "month"
          },
          "aggs": {
            "thirtieth_difference": {
              "serial_diff": {
                "buckets_path": "_count",
                "lag": 1
              }
            }
          }
        }
      }
    }
  }
}
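
For comparison, the variation described in the original question can also be computed client-side from the response of the first query (date histogram with nested terms). The following is only a rough sketch under a few assumptions: the helper name `term_variations` is hypothetical, it compares the first and last date buckets only, a term missing from a bucket is treated as a count of 0, and the sample data is copied from the response shown in the question.

```python
def term_variations(histo_buckets, terms_agg="categories"):
    """Per-term doc_count change between the first and last date buckets.

    Hypothetical helper: `histo_buckets` is the list found under
    aggregations.my_date_histo.buckets in the response.
    """
    def counts(bucket):
        # Map each term key to its doc_count for one date bucket.
        return {b["key"]: b["doc_count"] for b in bucket[terms_agg]["buckets"]}

    first, last = counts(histo_buckets[0]), counts(histo_buckets[-1])
    # Assumption: a term absent from a bucket counts as 0.
    diffs = [
        {"key": key, "doc_count": last.get(key, 0) - first.get(key, 0)}
        for key in set(first) | set(last)
    ]
    # Largest positive variation first; ties broken by key for stable output.
    return sorted(diffs, key=lambda d: (-d["doc_count"], d["key"]))


# Sample data taken from the response in the question.
buckets = [
    {
        "key_as_string": "2018-02-01T00:00:00.000Z",
        "categories": {"buckets": [
            {"key": "A", "doc_count": 10},
            {"key": "B", "doc_count": 5},
            {"key": "C", "doc_count": 0},
        ]},
    },
    {
        "key_as_string": "2018-03-01T00:00:00.000Z",
        "categories": {"buckets": [
            {"key": "A", "doc_count": 10},
            {"key": "B", "doc_count": 4},
            {"key": "C", "doc_count": 3},
        ]},
    },
]

variations = term_variations(buckets)
for entry in variations:
    print(entry["key"], entry["doc_count"])
```

On the sample data this gives C, A, B in that order, which matches the expected output in the question. Reversing the list would give the top negative variations instead. Note that this only sees the terms returned by the terms aggregation, so its "size" may need to be raised.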

system · April 27, 2018, 3:58pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.