Bucket script fails when some docs are missing

Elasticsearch version (bin/elasticsearch --version): 6.0.0-rc1

JVM version (java -version):
java version "1.8.0_151"
Java(TM) SE Runtime Environment (build 1.8.0_151-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.151-b12, mixed mode)
OS version (uname -a if on a Unix-like system):
Linux elasticsearch-data-hot-003 4.11.0-1013-azure #13-Ubuntu SMP Mon Oct 2 17:59:06 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Description of the problem including expected versus actual behavior:
When running a bucket script aggregation that depends on a cumulative sum aggregation over another sum aggregation, if the sum aggregation returns null values (because there are no docs in that time-interval bucket), the bucket script aggregation also returns null instead of using the cumulative sum value gathered so far.
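To make the expected-versus-actual difference concrete, here is a minimal Python sketch of the behavior described above. It is only a model of what the post reports, not Elasticsearch's implementation; the `sums` values are hypothetical, and `None` stands for a time bucket with no docs.

```python
from itertools import accumulate

# Hypothetical per-minute sum_bytes values; None marks a bucket with no docs.
sums = [100, 100, None, None, 100]

# cumulative_bytes: the running total carries through the empty buckets.
cumulative = list(accumulate(0 if s is None else s for s in sums))

# Reported behavior: the bucket script has no value wherever the bucket is empty.
reported = [c if s is not None else None for s, c in zip(sums, cumulative)]

# Expected behavior: the bucket script sees the cumulative value in every bucket.
expected = cumulative

print(reported)
print(expected)
```

In this model the two lists differ only at the positions of the empty buckets, which is exactly where the post expects the running total to appear.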

Steps to reproduce:

  1. Add docs spanning 5 minutes that look like this:
{
  "@timestamp": "",
  "bytes": 100
}
  2. Run a query whose time range extends past the end of those 5 minutes (so that there are date histogram buckets without docs), with this aggregation:
{
  "aggs": {
    "timeseries": {
      "date_histogram": {
        "field": "@timestamp",
        "interval": "1m",
        "min_doc_count": 0,
        "time_zone": "UTC"
      },
      "aggs": {
        "sum_bytes": {
          "sum": {
            "field": "bytes"
          }
        },
        "cumulative_bytes": {
          "cumulative_sum": {
            "buckets_path": "sum_bytes"
          }
        },
        "bucket": {
          "bucket_script": {
            "buckets_path": {
              "bytes": "cumulative_bytes"
            },
            "script": {
              "source": "params.bytes",
              "lang": "painless"
            }
          }
        }
      }
    }
  }
}
  3. Check the response and see that, in the date histogram buckets without docs, the bucket aggregation does not show the value that it's supposed to (the cumulative_bytes value). It doesn't exist at all for those time buckets.
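The check in the last step can be sketched in Python as follows. This assumes the search response has already been parsed into a dict; the helper name `buckets_missing_agg` is hypothetical, and the `sample` response is a hand-written, abridged fragment with placeholder keys (`t0`, `t1`), not real server output.

```python
def buckets_missing_agg(response, histogram="timeseries", agg="bucket"):
    """Return the key_as_string of every histogram bucket that lacks `agg`."""
    buckets = response["aggregations"][histogram]["buckets"]
    return [b["key_as_string"] for b in buckets if agg not in b]


# Hand-written, hypothetical response fragment (placeholder keys and values).
sample = {
    "aggregations": {
        "timeseries": {
            "buckets": [
                {
                    "key_as_string": "t0",
                    "doc_count": 1,
                    "sum_bytes": {"value": 100.0},
                    "cumulative_bytes": {"value": 100.0},
                    "bucket": {"value": 100.0},
                },
                {
                    # Empty bucket: cumulative_bytes is present, "bucket" is not.
                    "key_as_string": "t1",
                    "doc_count": 0,
                    "sum_bytes": {"value": 0.0},
                    "cumulative_bytes": {"value": 100.0},
                },
            ]
        }
    }
}

print(buckets_missing_agg(sample))
```

Any key this returns is a time bucket where the bucket script result is absent even though the histogram bucket itself is there.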

It looks like your cumulative_sum aggregation has a buckets_path of sum_bytes, but your sum aggregation is actually called sum_http. I think that might be the reason it's not working?

It's just a typo; I fixed it.

Is this a typo as well? Should there be a date here?

It's not a typo. The aggregation requires more than one document, and I didn't want to make the thread too long, so it's a placeholder for the timestamp, to be filled in by whoever helps me debug this.

Let me add all the docs.

I have tried to reproduce this using the provided gist (the first file contains the requests and the second file is the output of the search request). I don't see any problem with the output. Maybe you could provide an edited version of that script that shows the problem, and/or point out what is unexpected in the output for you? https://gist.github.com/colings86/7f9e1cd4670f517364679f322a535628

Actually, I see what you mean now: the empty buckets don't have a value for the bucket script aggregation. This is a bug. I'll raise an issue on the GH repo so it can be fixed.

@colings86 the bug exists in your output :slight_smile:

If you look closely, the bucket aggregation is missing from every time bucket that has no documents.

So the bucket aggregation only appears for 2017-01-01T00:00:00.000Z & 2017-01-01T00:05:00.000Z

I would expect the bucket aggregation to appear in all time buckets.

Don't you agree?

@colings86 I already raised this issue: https://github.com/elastic/elasticsearch/issues/27377 but it was closed by your team :slight_smile:

You can reopen that one

OK, I have reopened the issue and commented with my recreation. Thanks for raising this, and sorry there was some confusion as to whether this was a bug or not.
