Aggregate events with start and end date

nmeaux · December 28, 2018, 5:07pm

Hello,

I have events with the fields start_date and end_date, both in epoch format, in the same document.
I want to sum field values from the events on the Y-axis (shown as % in the end), with the X-axis representing time, so that each event counts at every point between its start_date and end_date.
I wish to do it with Kibana.
I have been looking at Timelion and the bucket script aggregation, but I am a little confused about how to do this.
By reading some quite old posts, I found a method that pre-processes the events and inserts into each one an array field "running_at" that lists the dates on which the event is running.
I guess we could do it more efficiently with a pipeline aggregation, but I am stuck on how to implement it.
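
For reference, the "running_at" pre-processing idea could look roughly like the sketch below. It assumes epoch seconds, UTC, and one entry per calendar day; the function names are hypothetical and the granularity is only an illustration.

```python
from datetime import datetime, timedelta, timezone


def running_days(start_epoch, end_epoch):
    """Return every UTC calendar day (ISO string) touched by the interval."""
    start = datetime.fromtimestamp(start_epoch, tz=timezone.utc).date()
    end = datetime.fromtimestamp(end_epoch, tz=timezone.utc).date()
    days = []
    current = start
    while current <= end:
        days.append(current.isoformat())
        current += timedelta(days=1)
    return days


def add_running_at(event):
    """Return a copy of the event with a "running_at" array added."""
    enriched = dict(event)
    enriched["running_at"] = running_days(event["start_date"], event["end_date"])
    return enriched
```

Each enriched document can then be bucketed with a plain date histogram on "running_at", at the cost of larger documents for long-running events.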

Thx in advance for any kind of help.

Nicolas

nmeaux · January 18, 2019, 10:46pm

Hello,

Replying to myself: I am looking at an Elasticsearch request to pre-process my event documents and reinject the result into another index. I have something like this for the moment:

POST /jobs/_search
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "time_start": {
              "lte": 1540749600
            }
          }
        },
        {
          "range": {
            "time_end": {
              "gte": 1540749600
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "server": {
      "terms": {
        "field": "server.keyword"
      },
      "aggs": {
        "sum_alloc_cpu": {
          "sum": {
            "field": "alloc_cpu"
          }
        }
      }
    }
  }
}

So, with this request I get the pairs of values I want, but of course only for the specified epoch at which the events run.
I have not yet found how to make, let's say, 60 buckets and then combine these 60 values with an average aggregation, which would be a bit less painful than executing the request 2592000 times (the number of seconds in one month).
I suspect I could use a Painless script within the first aggregation (instead of the query), but I did not find how to arrange it.
If someone has an idea, I would be more than happy to hear it.
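
To make the "60 buckets" idea concrete, here is a minimal sketch of how the sample epochs and the matching query could be generated on the client side. The helper names are hypothetical, and evenly spaced sampling is an assumption, not something Elasticsearch does for you.

```python
def sample_epochs(window_start, window_end, samples):
    """Return `samples` evenly spaced epoch seconds, starting at window_start."""
    step = (window_end - window_start) // samples
    return [window_start + i * step for i in range(samples)]


def overlap_query(epoch):
    """Bool query matching jobs running at `epoch` (same shape as the request above)."""
    return {
        "bool": {
            "must": [
                {"range": {"time_start": {"lte": epoch}}},
                {"range": {"time_end": {"gte": epoch}}},
            ]
        }
    }


def average(values):
    """Plain arithmetic mean; returns 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0
```

With a one-month window, sample_epochs(start, start + 2592000, 60) gives one sample every 12 hours, so 60 requests (or 60 filters) instead of 2592000.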

rgds,

Nicolas

Christian_Dahlqvist · January 19, 2019, 9:52am

I wonder if you might be able to do this through a scripted aggregation, similar to the example in this old post.

nmeaux · January 22, 2019, 10:40pm

Hi Christian,

The old post you mentioned was the first thing I tried for my issue. I ran into some limitations with Painless scripts inside Kibana due to the number of objects I have to manage (~3 million).
By the way, I think I am near a solution:

GET /jobs/_search
{
  "size": 0,
  "aggs": {
    "server": {
      "terms": {
        "field": "server.keyword"
      },
      "aggs": {
        "jobs1": { .. },
        "jobs2": { .. }
      }
    },
    "sum_total": {
      "sum_bucket": {
        "buckets_path": "server>jobs1>sum_alloc_cpu"
      }
    }
  }
}

Where:

"jobsX": {
  "filter": {
    "bool": {
      "must": [
        {
          "range": {
            "time_start": {
              "lte": 153835200X
            }
          }
        },
        {
          "range": {
            "time_end": {
              "gte": 153835200X
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "sum_alloc_cpu": {
      "sum": {
        "field": "alloc_cpu"
      }
    }
  }
}
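
Writing jobs1 to jobsX by hand does not scale, so the request body can be generated. The sketch below builds it in Python from a list of epochs; the function names are hypothetical, while the index fields (time_start, time_end, alloc_cpu, server.keyword) are the ones used above.

```python
import json


def job_filter_agg(epoch):
    """One "jobsX" filter aggregation for jobs running at `epoch`."""
    return {
        "filter": {
            "bool": {
                "must": [
                    {"range": {"time_start": {"lte": epoch}}},
                    {"range": {"time_end": {"gte": epoch}}},
                ]
            }
        },
        "aggs": {"sum_alloc_cpu": {"sum": {"field": "alloc_cpu"}}},
    }


def build_request(epochs):
    """Build the search body with one jobsN sub-aggregation per epoch."""
    jobs = {f"jobs{i}": job_filter_agg(e) for i, e in enumerate(epochs, start=1)}
    return {
        "size": 0,
        "aggs": {"server": {"terms": {"field": "server.keyword"}, "aggs": jobs}},
    }


if __name__ == "__main__":
    # Print the body so it can be pasted into a GET /jobs/_search request.
    print(json.dumps(build_request([1538352000, 1538352001]), indent=2))
```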

I am stuck here trying to give buckets_path a kind of wildcard like "server>jobs*>sum_alloc_cpu".

The response I currently get is:

{
  "aggregations" : {
    "server" : {
      "buckets" : [
        {
          "key" : "server1",
          "jobs1" : {
            "alloc_cpu" : {
              "value" : 5.0
            }
          },
          "jobs2" : {
            "alloc_cpu" : {
              "value" : 8.0
            }
          }
        },
        {
          "key" : "server2",
          "jobs1" : {
            "alloc_cpu" : {
              "value" : 7.0
            }
          },
          "jobs2" : {
            "alloc_cpu" : {
              "value" : 3.0
            }
          }
        },
        {
          "key" : "server3",
          "jobs1" : {
            "alloc_cpu" : {
              "value" : 4.0
            }
          },
          "jobs2" : {
            "alloc_cpu" : {
              "value" : 1.0
            }
          }
        }
      ]
    },
    "sum_total" : {
      "value" : 16.0
    }
  }
}

I would like to get the sum of these aggregations (jobs1, jobs2, etc.) for alloc_cpu per key (server1, server2, ...).

I would definitely prefer to script the generation of the aggregations from jobs1 to jobsX, but for now, how can I sum the aggregation metrics in the way I described?
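
One way around the missing wildcard is to do the per-server sum on the client after the response comes back. This is only a sketch: the function name is hypothetical, and the metric key is a parameter because it has to match the name that actually appears in the response.

```python
def sum_jobs_per_server(response, metric="sum_alloc_cpu", prefix="jobs"):
    """Sum the given metric over every jobs* sub-aggregation of each server bucket."""
    totals = {}
    for bucket in response["aggregations"]["server"]["buckets"]:
        total = 0.0
        for name, sub in bucket.items():
            # Skip "key", "doc_count" and anything that is not a jobs* aggregation.
            if name.startswith(prefix) and isinstance(sub, dict) and metric in sub:
                value = sub[metric].get("value")
                if value is not None:
                    total += value
        totals[bucket["key"]] = total
    return totals
```

Applied to the response above with metric="alloc_cpu", this gives 13.0 for server1, 10.0 for server2 and 5.0 for server3.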

Thanks in advance.

rgds

Nicolas

nmeaux · January 31, 2019, 10:11pm

Moving ahead on the issue I am trying to resolve, I found something interesting by using bucket_script in this way:

         "aggs": {
            "alloccpu": {
                "bucket_script": {
                    "buckets_path": {
                        "jobs1cpu": "jobs1>sum_alloc_cpu",
                        "jobs2cpu": "jobs2>sum_alloc_cpu",
                        "jobs3cpu": "jobs3>sum_alloc_cpu"
                    },
                    "script": {
                        "source": "(params.jobs1cpu+params.jobs2cpu+params.jobs3cpu)"
                    }
                }
            },
            "jobs1": {..},
            "jobs2": {..},
            "jobs3": {..},

The problem is that with thousands of jobs I get something like:

Caused by: java.lang.IllegalArgumentException: Scripts may be no longer than 16384 characters. The passed in script is 67313 characters. Consider using a plugin if a script longer than this length is a requirement.

From the thread "Scripts may be no longer than 16384 characters" (reply #3 by bfcshop), I understand that I may be able to use Elasticsearch 6.6 and increase the soft limit.
But I am looking for something more optimized than what I have now.
Any ideas?
Thx in advance
Rgds
Nicolas
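
A possible way to avoid the long script altogether is to put all the sample filters into a single "filters" aggregation and let a "sum_bucket" pipeline aggregation add up its buckets, so no script is needed. This is an untested sketch: it assumes sum_bucket accepts a filters aggregation as its multi-bucket sibling, and the names "samples", "alloc_cpu_total" and the Python helpers are hypothetical.

```python
def overlap_filter(epoch):
    """Filter matching jobs running at `epoch`."""
    return {
        "bool": {
            "must": [
                {"range": {"time_start": {"lte": epoch}}},
                {"range": {"time_end": {"gte": epoch}}},
            ]
        }
    }


def build_request(epochs):
    """Per server: one filters bucket per epoch, summed by sum_bucket."""
    # Named filters keyed by the epoch they sample (assumed naming scheme).
    named_filters = {str(e): overlap_filter(e) for e in epochs}
    return {
        "size": 0,
        "aggs": {
            "server": {
                "terms": {"field": "server.keyword"},
                "aggs": {
                    "samples": {
                        "filters": {"filters": named_filters},
                        "aggs": {"sum_alloc_cpu": {"sum": {"field": "alloc_cpu"}}},
                    },
                    "alloc_cpu_total": {
                        "sum_bucket": {"buckets_path": "samples>sum_alloc_cpu"}
                    },
                },
            }
        },
    }
```

If an average over the samples is wanted instead of a sum, avg_bucket could be swapped in for sum_bucket with the same buckets_path.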

system · February 28, 2019, 10:11pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
