Aggregate with a script yields different results

Hi,
I'm trying to get the average of the time difference between two time fields with the following query:

GET x/_search
{
  "_source": ["ts", "@timestamp"], 
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "@timestamp": {
        "order": "desc"
      }
    }
  ],
   "size": 1,
   "script_fields" : {
        "test1" : {
            "script" : {
                "lang": "painless",
                "source": "(doc['@timestamp'].value.millis - doc['ts'].value.millis) / 1000"
            }
        }
   }
   , "aggs": {
     "avarge_delay": {
       "avg": {
         "script": {
                "source": "(doc['@timestamp'].value.millis - doc['ts'].value.millis) / 1000",
                "lang": "painless"
            }
       }
     }
   }
  
}

the "test1" scripted field gives the wanted result. but the aggregation is way off of the average (i gave it size =1 ).
as you can see i used the exact same script

If I return only one document (size=1), I get test1=4000, and the aggregated value is about 1000 times larger.

I'll be happy if anyone can point me to my mistake.

That's what I get when querying with size=1.

Your script field will return you the value of @timestamp - ts in seconds for that one document that you are returning. That's what script fields do, they return a script calculated value per hit.

The avg aggregation will run on all documents and return you the average value of @timestamp - ts across all docs. Setting size=1 has no influence on the aggregation scope. That scope is determined by all documents that match your query (a match_all query in your case).
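A quick way to see this separation for yourself is to set size to 0: the hits (and with them any script fields) disappear entirely, while the aggregation still runs over every document that matches the query. A minimal sketch:

```json
GET x/_search
{
  "size": 0,
  "query": {
    "match_all": {}
  },
  "aggs": {
    "average_delay": {
      "avg": {
        "script": {
          "source": "(doc['@timestamp'].value.millis - doc['ts'].value.millis) / 1000",
          "lang": "painless"
        }
      }
    }
  }
}
```

This will return the same aggregated value no matter what size you use, because size only affects how many hits are returned.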

Thank you for your reply!
I'm fairly new to Elasticsearch. In order to get the average of the last 200 documents by @timestamp, should I filter somehow under the match_all query? How can the avg be computed over only the last 200?

thank you again for helping!

What you're asking is not trivial :slightly_smiling_face:

If you simply wanted to get the average over a specific time period, that would be easy. You could use a range query that limits the documents to, for example, just those of the last two hours, using something called date math (now-2h in the example below):

GET x/_search
{
  "size": 0,
  "query": {
    "range": {
      "@timestamp": {
        "gt": "now-2h"
      }
    }
  },
  "aggs": {
    "average_delay": {
      "avg": {
        "script": {
          "source": "(doc['@timestamp'].value.millis - doc['ts'].value.millis) / 1000",
          "lang": "painless"
        }
      }
    }
  }
}

If you really want to aggregate on a specific number of documents then you would have to do some advanced stuff. You could use a distance_feature query (new in Elasticsearch 7.2) to give your documents a score based on how old they are. You could combine that with a sampler aggregation that limits the scope to just a specific number of highest scoring documents (i.e. the most recent documents). That request would look something like this:

{
  "size": 0,
  "query": {
    "distance_feature": {
      "field": "@timestamp",
      "pivot": "730d",
      "origin": "now"
    }
  },
  "aggs": {
    "sample": {
      "sampler": {
        "shard_size": 200
      },
      "aggs": {
        "average_delay": {
          "avg": {
            "script": {
              "source": "(doc['@timestamp'].value.millis - doc['ts'].value.millis) / 1000",
              "lang": "painless"
            }
          }
        }
      }
    }
  }
}

Note: that last request will only work if you are querying a single index with a single shard (the default in Elasticsearch 7).

Thanks so much for your assistance again!
The first, simple method would be more than enough.
Unfortunately, it didn't solve my problem. I ran the following:

GET X/_search
{
  "_source": ["ts", "@timestamp"],
  "size": 1,
  "query": {
    "range": {
      "@timestamp": {
        "gt": "now-2h"
      }
    }
  },
  "script_fields" : {
        "test1" : {
            "script" : {
                "lang": "painless",
                "source": "(doc['@timestamp'].value.millis - doc['ts'].value.millis) / 1000"
            }
        }
   },
  "aggs": {
    "average_delay": {
      "avg": {
        "script": {
          "source": "(doc['@timestamp'].value.millis - doc['ts'].value.millis) / 1000",
          "lang": "painless"
        }
      }
    }
  }
}

and got the same results. The aggregated value doesn't make sense at all...
It's bizarre, because test1 yields the correct value.

This is the outcome:

{
  "took": 402,
  "timed_out": false,
  "_shards": {
    "total": 52,
    "successful": 52,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2996939,
    "max_score": 1,
    "hits": [
      {
        "_index": "X",
        "_type": "logs",
        "_id": "11Qf02wBfoJnQ0YMzlqG",
        "_score": 1,
        "_source": {
          "@timestamp": "2019-08-27T12:49:32.086Z",
          "ts": "2019-08-27T12:49:31,320"
        },
        "fields": {
          "test1": [
            0
          ]
        }
      }
    ]
  },
  "aggregations": {
    "average_delay": {
      "value": 16937778.22412268
    }
  }
}

Do you have any guesses?
Thanks again for your help!

You're comparing apples and oranges again. The script field returns the value of @timestamp - ts in seconds for the one document that you are returning. The aggregation returns the average over the last two hours of data.

I know, but that's not the average for the last two hours. The last two hours show only 0 in the test1 field, so the average should be around 0. That's why the average doesn't make sense.

If I change the time to 1m I still get a huge number, even though in the last minute all test1 values are 0.

This was indeed a good approach, but the script behaves differently under the aggregation.
If you have any idea what the problem could be, that would be great. I've marked it as the solution for now.

My guess would be that you may have a data quality issue. Maybe there is a document with a very high difference between @timestamp and ts? This would skew the average to return that high number.

Well, I left it alone for a while, but I think I found the problem.
doc['ts'] gives the milliseconds from epoch.
doc['@timestamp'] also gives the milliseconds from epoch.
But when I try to subtract them I get:

"type": "class_cast_exception",
"reason": "class_cast_exception: Cannot apply operation to types [org.elasticsearch.index.fielddata.ScriptDocValues.Dates] and [org.elasticsearch.index.fielddata.ScriptDocValues.Dates]."

doc['ts'].value gives 0
doc['@timestamp'].value still gives millis from epoch

And when trying to subtract them (without adding .millis) I get an error:

"type": "class_cast_exception",
"reason": "class_cast_exception: Cannot apply operation to types [org.joda.time.MutableDateTime] and [org.joda.time.MutableDateTime]."

It's as if adding .value changes one of the types. In Elasticsearch they are both mapped as dates.

Do you have any idea what could cause such a thing? This is the reason I got numbers that don't add up: one of the fields was giving zero.
Is there any way to format them to be the same and make the subtraction work?

Thank you very much for your help so far

If doc['ts'].value returns 0 then I would think you've got a document with a ts field with a value of 0. This would be the data quality issue I was referring to.

Let's check a few things. Can you paste the output of:

GET X/_mapping

Let's also take a look at the docs that have the lowest and highest values of both ts and @timestamp. Can you paste the output of:

GET X/_msearch
{}
{"_source":["ts","@timestamp"],"size":1,"query":{"range":{"@timestamp":{"gt":"now-2h"}}},"sort":[{"ts":{"order":"asc"}}]}
{}
{"_source":["ts","@timestamp"],"size":1,"query":{"range":{"@timestamp":{"gt":"now-2h"}}},"sort":[{"ts":{"order":"desc"}}]}
{}
{"_source":["ts","@timestamp"],"size":1,"query":{"range":{"@timestamp":{"gt":"now-2h"}}},"sort":[{"@timestamp":{"order":"asc"}}]}
{}
{"_source":["ts","@timestamp"],"size":1,"query":{"range":{"@timestamp":{"gt":"now-2h"}}},"sort":[{"@timestamp":{"order":"desc"}}]}
"ts": {
            "type": "date"
          }

 "properties": {
          "@timestamp": {
            "type": "date"
          },

and for the query:

{
  "responses": [
    {
      "took": 471,
      "timed_out": false,
      "_shards": {
        "total": 13,
        "successful": 12,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 18194874,
        "max_score": null,
        "hits": [
          {
            "_index": "X",
            "_type": "logs",
            "_id": "v-Af6mwBFrPs1dZnQTpj",
            "_score": null,
            "_source": {
              "@timestamp": "2019-09-01T00:00:01.561Z",
              "ts": "2019-09-01T00:00:00,001"
            },
            "sort": [
              1567296000001
            ]
          }
        ]
      },
      "status": 200
    },
    {
      "took": 470,
      "timed_out": false,
      "_shards": {
        "total": 13,
        "successful": 12,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 18194874,
        "max_score": null,
        "hits": [
          {
            "_index": "X",
            "_type": "logs",
            "_id": "5d_S62wBfoJnQ0YMbO6D",
            "_score": null,
            "_source": {
              "@timestamp": "2019-09-01T07:48:58.298Z",
              "ts": "2019-09-01T07:48:57,793"
            },
            "sort": [
              1567324137793
            ]
          }
        ]
      },
      "status": 200
    },
    {
      "took": 136,
      "timed_out": false,
      "_shards": {
        "total": 13,
        "successful": 12,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 18194874,
        "max_score": null,
        "hits": [
          {
            "_index": "X",
            "_type": "logs",
            "_id": "vT8f6mwBfoJnQ0YMPSVJ",
            "_score": null,
            "_source": {
              "@timestamp": "2019-09-01T00:00:01.214Z",
              "ts": "2019-09-01T00:00:00,008"
            },
            "sort": [
              1567296001214
            ]
          }
        ]
      },
      "status": 200
    },
    {
      "took": 119,
      "timed_out": false,
      "_shards": {
        "total": 13,
        "successful": 12,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 18194874,
        "max_score": null,
        "hits": [
          {
            "_index": "X",
            "_type": "logs",
            "_id": "5d_S62wBfoJnQ0YMbO6D",
            "_score": null,
            "_source": {
              "@timestamp": "2019-09-01T07:48:58.298Z",
              "ts": "2019-09-01T07:48:57,793"
            },
            "sort": [
              1567324138298
            ]
          }
        ]
      },
      "status": 200
    }
  ]
}

I extended it to 24 hours since I had no docs from the last 2 hours.
It seems like there's no data quality issue. If I could just subtract them, it would work fine :pensive:

thanks again

I don't understand what's going on here :slightly_smiling_face:. If you run the avg aggregation on the last 24 hours of data, does it give you that unusually large number?

The only thing that looks off to me is the format of your ts dates. Those dates are not formatted like @timestamp, and are missing a timezone. A date like 2019-09-01T00:00:00,001 is not formatted in a way that Elasticsearch would accept by default.

Whenever I try to do arithmetic operations I get an error.
Only if I add .value.millis can I do the operations, but then the ts value is 0, and that's why I get bad results.

Do you have an idea how to format it correctly?

The problem is not on the script side, it is in your data:

PUT test/_doc/1
{
  "@timestamp": "2019-08-27T12:49:32.000Z",
  "ts": "2019-08-27T12:49:30.000Z"
}

I'm indexing a single document with the correct date format. And now I'm going to run the exact same aggregation that you used:

GET test/_search
{
  "script_fields": {
    "test1": {
      "script": {
        "lang": "painless",
        "source": "(doc['@timestamp'].value.millis - doc['ts'].value.millis) / 1000"
      }
    }
  },
  "aggs": {
    "average_delay": {
      "avg": {
        "script": {
          "source": "(doc['@timestamp'].value.millis - doc['ts'].value.millis) / 1000",
          "lang": "painless"
        }
      }
    }
  }
}

The output will be the following:

{
  "took" : 89,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "fields" : {
          "test1" : [
            2
          ]
        }
      }
    ]
  },
  "aggregations" : {
    "average_delay" : {
      "value" : 2.0
    }
  }
}

As you can see, the value from the document and from the bucket is the same. Now let's add a second document, with a 6-second difference:

PUT test/_doc/2
{
  "@timestamp": "2019-08-27T12:49:36.000Z",
  "ts": "2019-08-27T12:49:30.000Z"
}

I ran the exact same search as above and got the following results, which again are the ones we expect:

{
  "took" : 806,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "fields" : {
          "test1" : [
            2
          ]
        }
      },
      {
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "fields" : {
          "test1" : [
            6
          ]
        }
      }
    ]
  },
  "aggregations" : {
    "average_delay" : {
      "value" : 4.0
    }
  }
}

Thanks for your help.
We've established that the problem is the data; I'm trying to find a solution.
Is there a way to handle it somehow? Cast it to another type? Anything?

I don't really know your data ingestion process, but my recommendation would be to use an ingest pipeline to fix your date format.
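As a sketch of that idea (the pipeline name is made up, and the comma-millis pattern is inferred from the ts values shown earlier), a date processor can reparse the field during indexing:

```json
PUT _ingest/pipeline/reparse-ts
{
  "description": "Parse the comma-millis ts field into a standard date",
  "processors": [
    {
      "date": {
        "field": "ts",
        "formats": ["yyyy-MM-dd'T'HH:mm:ss,SSS"],
        "target_field": "ts",
        "timezone": "UTC"
      }
    }
  ]
}
```

You would then index with ?pipeline=reparse-ts, or set the pipeline as the index default.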

The problem in this thread was: "Aggregate with a script yields different results" and it seems to be solved. I would recommend opening a new question for your new problem, which is now a date or data formatting issue.

@yavidor Are you sure your data is clean? Is it possible you have some documents that do not have the ts field?

doc['ts'].value gives 0
doc['@timestamp'].value still gives millis from epoch

This would happen in older versions (before 7.0) of Elasticsearch if the ts field is missing from a document. doc['ts'] is essentially a list of values, so you need to check doc['ts'].size().
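As a sketch, that check could look like this in your script field, using -1 as an arbitrary sentinel for documents without a ts value:

```json
GET X/_search
{
  "script_fields": {
    "test1": {
      "script": {
        "lang": "painless",
        "source": "doc['ts'].size() == 0 ? -1 : (doc['@timestamp'].value.millis - doc['ts'].value.millis) / 1000"
      }
    }
  }
}
```

For the avg aggregation itself, excluding those documents with an exists query is the more reliable fix.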

That indeed was the problem.
I solved it by filtering out the documents with the missing field:

GET X/_search
{
  "_source": ["ts", "@timestamp"],
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "@timestamp": {
              "gte": "now-1h"
            }
          }
        },
        {
          "exists": {
            "field": "ts"
          }
        }
      ]
    }
  },
  "aggs": {
    "avg_delay": {
      "avg": {
        "script": {
          "source": "(doc['@timestamp'].value.millis - doc['ts'].value.millis) / 1000",
          "lang": "painless"
        }
      }
    }
  }
}

Thanks to everyone who tried to help out!

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.