Error for scripted_metric aggregation on auto_date_histogram aggregation

I am working with the sample log dataset. Here is what it looks like:

"hits" : [
      {
        "_index" : "kibana_sample_data_logs",
        "_type" : "_doc",
        "_id" : "nfvc3X4BKbUZuD5M4wWJ",
        "_score" : 1.0,
        "_source" : {
          "agent" : "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)",
          "bytes" : 4155,
          "clientip" : "57.65.101.133",
          "host" : "artifacts.elastic.co",
          "index" : "kibana_sample_data_logs",
          "ip" : "57.65.101.133", 
          "message" : "57.65.101.133 - - [2018-07-25T14:13:30.450Z] \"GET /beats/filebeat/filebeat-6.3.2-linux-x86_64.tar.gz HTTP/1.1\" 200 4155 \"-\" \"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)\"",
          "phpmemory" : null,
          "timestamp" : "2022-02-02T14:13:30.450Z",
          "url" : "https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-6.3.2-linux-x86_64.tar.gz",
          "utc_time" : "2022-02-02T14:13:30.450Z",
          "event" : {
            "dataset" : "sample_web_logs"
          }
        }
      },

I want to find the maximum number of bytes transferred (max_bytes) for documents bucketed by a date histogram. In other words, for every document that falls into a bucket created by the date histogram, I want to find the max bytes using a map/combine/reduce scripted_metric aggregation.
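(As a sanity check for the expected numbers, and not as the scripted_metric exercise itself, the same per-bucket maximum can be computed with the built-in max aggregation:

GET kibana_sample_data_logs/_search?size=0
{
  "aggs": {
    "doc_buckets_for_date_histogram": {
      "auto_date_histogram": {
        "field": "timestamp",
        "buckets": 10
      },
      "aggs": {
        "max_bytes": {
          "max": { "field": "bytes" }
        }
      }
    }
  }
}

This works fine; the problem described below is specific to scripted_metric.)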

Here is what I tried:

GET kibana_sample_data_logs/_search?size=0
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "doc_buckets_for_date_histogram": {
      "auto_date_histogram": {
        "field": "timestamp", 
        "buckets": 10
      }, 
      "aggs": {
        "max_bytes": {
          "scripted_metric": {
            "init_script": "state.max_bytes = 0L;", 
            "map_script": """ 
              def current_bytes = doc['bytes'].getValue();
              if (current_bytes > state.max_bytes)
                {state.max_bytes = current_bytes;}
            """,
            "combine_script": "return state", 
            "reduce_script": """ 
              def max_bytes = 0L;
              for (s in states) {if (s.max_bytes > (max_bytes))
                {max_bytes = s.max_bytes;}}
              return max_bytes
            """
          }
        }   
      }
    }
  }
}

But it is giving an error:

{
  "error" : {
    "root_cause" : [ ],
    "type" : "search_phase_execution_exception",
    "reason" : "",
    "phase" : "fetch",
    "grouped" : true,
    "failed_shards" : [ ],
    "caused_by" : {
      "type" : "script_exception",
      "reason" : "runtime error",
      "script_stack" : [
        "if (s.max_bytes > (max_bytes))\n                {",
        "     ^---- HERE"
      ],
      "script" : "  ...",
      "lang" : "painless",
      "position" : {
        "offset" : 74,
        "start" : 69,
        "end" : 117
      },
      "caused_by" : {
        "type" : "illegal_argument_exception",
        "reason" : "dynamic getter [java.lang.Long, max_bytes] not found"
      }
    }
  },
  "status" : 400
}

It seems to me that the reduce script is not receiving a Map in s, and I don't know why.
Please help.

I'm not sure why, but the same scripted_metric worked under a terms bucket aggregation, yet failed under both the histogram and the date_histogram aggregations.

Something is wrong: s, which is supposed to be a HashMap (or possibly null), has somehow become a Long. Could this be a bug in scripted_metric aggregation on the histogram-family aggregations?

For reference, here is the terms version that works:

GET kibana_sample_data_logs/_search?size=0
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "doc_buckets_for_date_histogram": {
      "terms": {
        "field": "agent.keyword"
      },
      "aggs": {
        "max_bytes": {
          "scripted_metric": {
            "init_script": "state.max_bytes = 0L;", 
            "map_script": """ 
              def current_bytes = doc['bytes'].getValue();
              if (current_bytes > state.max_bytes)
                {state.max_bytes = current_bytes;}
            """,
            "combine_script": "return state", 
            "reduce_script": """ 
              def max_bytes = 0L;
              for (s in states) {if (s.max_bytes > (max_bytes))
                {max_bytes = s.max_bytes;}}
              return max_bytes
            """
          }
        }   
      }
    }
  }
}

I found a similar post from several years ago that was left unsolved...

When I add a null check for the bucket state (inserting if (Objects.isNull(s)) { max_bytes = max_bytes } else), the histogram and date_histogram aggregations work well.
But auto_date_histogram still doesn't work and raises the same error. It could be something specific to scripted_metric aggregation on auto_date_histogram.

The scripted_metric aggregation with a null check on a date_histogram aggregation works well:

GET kibana_sample_data_logs/_search?size=0
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "doc_buckets_for_date_histogram": {
      "date_histogram": {
        "field": "timestamp",
        "interval": "month"
      }, 
      "aggs": {
        "max_bytes": {
          "scripted_metric": {
            "init_script": "state.max_bytes = 0L;", 
            "map_script": """ 
              def current_bytes = doc['bytes'].getValue();
              if (current_bytes > state.max_bytes)
                {state.max_bytes = current_bytes;}
            """,
            "combine_script": "return state", 
            "reduce_script": """ 
              def max_bytes = 0L;
              for (s in states) {if (Objects.isNull(s)){max_bytes=max_bytes} else if (s.max_bytes > (max_bytes))
                {max_bytes = s.max_bytes;}}
              return max_bytes
            """
          }
        }   
      }
    }
  }
}
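One possible explanation (an assumption on my part, not confirmed in this thread): auto_date_histogram merges buckets repeatedly until it hits the target bucket count, so it may run the reduce phase more than once. On a second pass, states would contain the Long returned by the previous reduce instead of the per-shard Maps, which would explain "dynamic getter [java.lang.Long, max_bytes] not found". If that is the cause, a reduce script that defensively accepts both shapes might avoid the error (untested sketch):

GET kibana_sample_data_logs/_search?size=0
{
  "aggs": {
    "doc_buckets_for_date_histogram": {
      "auto_date_histogram": {
        "field": "timestamp",
        "buckets": 10
      },
      "aggs": {
        "max_bytes": {
          "scripted_metric": {
            "init_script": "state.max_bytes = 0L;",
            "map_script": """
              def current_bytes = doc['bytes'].value;
              if (current_bytes > state.max_bytes) { state.max_bytes = current_bytes; }
            """,
            "combine_script": "return state",
            "reduce_script": """
              def max_bytes = 0L;
              for (s in states) {
                if (s == null) { continue; }
                // s is a Map on the first reduce pass, but may already be the
                // Long returned by an earlier pass if reduce runs again
                def v = s instanceof Map ? s.max_bytes : s;
                if (v > max_bytes) { max_bytes = v; }
              }
              return max_bytes
            """
          }
        }
      }
    }
  }
}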

Thank you ever so much for taking the trouble to run the code in your own environment before responding to my message. I appreciate it a lot. I got the same result when running your code: it works with date_histogram but not with auto_date_histogram. Thank you again.

