Date histogram based on two date-type variables

My apologies in advance if the title of this topic is obscure or if I am not using the right nomenclature. I am quite new to Kibana.

Using Kibana 5.4

  • I have some data in Elasticsearch representing events of some type.
    Each event contains two date-type fields: 'start' and 'end'.

  • The content of 'start' is also the content of '@timestamp'.

  • Let's say that an event is 'active' between its 'start' and 'end' times.

I would like to create a Date Histogram where each bin (or is it called a bucket?) contains the total number of events that were active during that period of time. In other words, if the Nth bin goes from time t_N to time t_(N+1), it should contain the number of events for which

  1. start < t_(N+1)
  2. end > t_N
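
To make the two conditions concrete, here is a minimal sketch in plain Python, outside of Kibana, that counts the active events per bin. The function name, the sample events, and the bin settings are all hypothetical; only the overlap test itself comes from the conditions above.

```python
from datetime import datetime, timedelta


def active_counts(events, first_bin_start, bin_width, n_bins):
    """Count, for each bin, the events that overlap it.

    An event (start, end) overlaps the bin [t_n, t_next) when
    start < t_next and end > t_n.
    """
    counts = []
    for n in range(n_bins):
        t_n = first_bin_start + n * bin_width
        t_next = t_n + bin_width
        counts.append(
            sum(1 for start, end in events if start < t_next and end > t_n)
        )
    return counts


# Hypothetical sample events as (start, end) pairs.
events = [
    (datetime(2015, 7, 12, 9, 15), datetime(2015, 7, 12, 11, 30)),
    (datetime(2015, 7, 12, 10, 20), datetime(2015, 7, 12, 10, 40)),
    (datetime(2015, 7, 12, 12, 0), datetime(2015, 7, 12, 13, 0)),
]

# Five hourly bins starting at 09:00.
print(active_counts(events, datetime(2015, 7, 12, 9), timedelta(hours=1), 5))
```

Note that an event spanning several bins is counted once in each of them, which is exactly what a plain date histogram on 'start' alone does not give.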

Is this doable?

Thanks a lot in advance.
Jose

Hi @jcaballero

Sounds like this post is answering a pretty similar question using scripted fields to achieve that: Display concurrency in data on Kibana

Let me know whether you can tweak the script to make it fit your use case.

Something like this will be hard, but I guess it could be possible by using a filters aggregation instead of a date histogram.
Then define a filter for each of your ranges (like start >= now-7h AND end <= now-6h).
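
As a rough illustration of the filters-aggregation idea, the sketch below builds a request body with one named filter per hourly range. It uses the overlap condition from the original question (start before the end of the range, end after its beginning) rather than the containment condition in the example above. The function name and the aggregation name "active_events" are made up for illustration; the field names 'start' and 'end' are the ones from the question. This has not been tested against Kibana 5.4.

```python
import json


def active_filters_body(n_hours):
    """Build a search body with one filter per hourly range before now.

    Each filter matches events with start < range end and end > range start.
    """
    filters = {}
    for h in range(n_hours, 0, -1):
        lo = f"now-{h}h"
        hi = "now" if h == 1 else f"now-{h - 1}h"
        filters[f"{lo} to {hi}"] = {
            "bool": {
                "filter": [
                    {"range": {"start": {"lt": hi}}},
                    {"range": {"end": {"gt": lo}}},
                ]
            }
        }
    # size 0: only the aggregation buckets are wanted, not the hits.
    return {
        "size": 0,
        "aggs": {"active_events": {"filters": {"filters": filters}}},
    }


body = active_filters_body(3)
print(json.dumps(body, indent=2))
```

The printed JSON could be sent as the body of a _search request. The obvious drawback is that the ranges are fixed when the request is built, so they do not follow the time picker the way a date histogram does.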

Thanks for your prompt responses. BTW, I just noticed that my original post did not correctly describe which events go under each bin, but I am pretty sure you got it. I just fixed it.

Hi @flash1293

I am having a look at the post you mentioned. I am still a little bit confused about it.
To start, if I try to run the _search query example from that post directly from the command line, I get an error message:

curl -X GET "myhost:1234/concurrency/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "aggs": {
    "my_histo": {
      "date_histogram": {
        "script": "start = doc['start'].value; duration = doc['duration'].value * 1000; l = []; for (long i = 0; i < duration; i += interval) { l.add(start + i); }; return l;",
        "params": {
          "interval": 3600
        },
        "interval": "hour"
      }
    }
  }
}
'
{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "[date_histogram] unknown field [params], parser not found"
      }
    ],
    "type" : "illegal_argument_exception",
    "reason" : "[date_histogram] unknown field [params], parser not found"
  },
  "status" : 400
}

I guess it was not supposed to be executed from the command line.
How should I use that snippet (or similar) when building a "Date Histogram"?

Thanks,
Jose

Hey, it looks like the script is written in Groovy, which got deprecated. The go-to solution is now Painless. The script and its params should go into their own nested object, as shown here: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-datehistogram-aggregation.html#_using_a_script_to_aggregate_by_day_of_the_week

{
  "aggs": {
    "my_histo": {
      "date_histogram": {
        "script": {
          "lang": "painless",
          "source": "def start = doc['start'].value.millis; def duration = doc['duration'].value * 1000; def l = []; for (long i = 0; i < duration; i += params.interval) { l.add(start + i); } return l;",
          "params": {
            "interval": 3600
          }
        },
        "interval": "hour"
      }
    }
  }
}

Of course, the "start" and "duration" fields have to exist on your index for that to work.
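
To show what the script is meant to do, here is a small Python emulation of the idea: each document is expanded into a list of timestamps, one per step from its start until its duration runs out, and every hourly bucket then counts the documents that have at least one timestamp in it. This is only a sketch of the logic, not of how Elasticsearch evaluates the script. It works in seconds throughout and steps one hour at a time, which is an assumption about the intent, and the function names and sample documents are illustrative.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def expand(start, duration_s, step_s=3600):
    """Mirror the script's loop: start + i for i = 0, step, 2*step, ... < duration."""
    return [start + timedelta(seconds=i) for i in range(0, duration_s, step_s)]


def hourly_histogram(docs):
    """Count, per hourly bucket, the documents with a timestamp in that bucket."""
    counts = Counter()
    for start, duration_s in docs:
        # A set, so a document is counted at most once per bucket.
        buckets = {
            t.replace(minute=0, second=0, microsecond=0)
            for t in expand(start, duration_s)
        }
        counts.update(buckets)
    return dict(sorted(counts.items()))


# Illustrative documents as (start, duration in seconds).
docs = [
    (datetime(2015, 7, 12, 9, 14, 1, tzinfo=timezone.utc), 10000),
    (datetime(2015, 7, 12, 10, 20, 50, tzinfo=timezone.utc), 20000),
    (datetime(2015, 7, 12, 12, 0, 42, tzinfo=timezone.utc), 5000),
]

for bucket, count in hourly_histogram(docs).items():
    print(bucket.isoformat(), count)
```

One limitation of stepping from the start time: an event that ends shortly after an hour boundary may not produce a timestamp in that last bucket. The first document above ends at 12:00:41 but is not counted in the 12:00 bucket.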

Hi @flash1293, thanks a lot for your prompt response.
I will keep investigating. Maybe I don't have the right version of Kibana and/or Elasticsearch. I am still unsure how to make it work.
I created an index and put some data in it, following the example from the other thread:

curl -X PUT "my.es.host:1234/concurrency?pretty" -H 'Content-Type: application/json' -d'
{
  "mappings": {
    "logs": {
      "properties": {
        "start": {
          "type": "date"
        },
        "duration": {
          "type": "long"
        }
      }
    }
  }
}
'

curl -X PUT "my.es.host:1234/concurrency/logs/1?pretty" -H 'Content-Type: application/json' -d'
{
  "start": "2015-07-12T09:14:01Z",
  "duration": 10000
}
'

curl -X PUT "my.es.host:1234/concurrency/logs/2?pretty" -H 'Content-Type: application/json' -d'
{
  "start": "2015-07-12T10:20:50Z",
  "duration": 20000
}
'

curl -X PUT "my.es.host:1234/concurrency/logs/3?pretty" -H 'Content-Type: application/json' -d'
{
  "start": "2015-07-12T12:00:42Z",
  "duration": 5000
}
'

I then tried to copy and paste your Painless snippet into the Kibana Visualization (after removing the .millis, which I guess was introduced in a later version).

It fails, and I get this explanation in the logs:

Caused by: org.elasticsearch.common.io.stream.NotSerializableExceptionWrapper: class_cast_exception: java.util.ArrayList cannot be cast to java.lang.Number
