Date histogram on objects with a start and end time

Hello everyone, first-time poster here. Please let me know if I have made any mistakes in this post.

I'll use a specific use case to illustrate my question, but I would love a more general answer, as I have only just gotten into Kibana.

The Elastic version is 7.17.7 (RIP Timelion).

Context

I work on a Spark monitoring library. During a job's lifetime, this library sends messages containing some useful variables to Elasticsearch, and these are then displayed in a Kibana dashboard.

Below is an example of the message sent for a Spark executor when it dies:

{
  "timestamp" : 1650966972460, // mapped as date
  "sourceAppName" : "RandomTestApp", // keyword
  "sourceAppId" : "local-1650966959680", // keyword
  "sourceAppLabel" : "RandomTestApp#2022-04-26_11:55:34",
  "dataClass" : "ExecutorEntry", // keyword
  "executorEntry" : {
    "executorId" : "local-1650966959680:driver",
    "executorHost" : "localhost", // keyword
    "totalCores" : 8, // int
    "executorStartTime" : 1650966959724, // mapped as date
    "executorEndTime" : 1650966972460, // mapped as date
    "removedReason" : "Application ended.", // str
    "totalExecutorRunTime" : 14394, // ms
    "totalExecutorCpuTime" : 4888, // ms
    "totalExecutorBytesRead" : 37608, // B
    "totalExecutorBytesWritten" : 0, // B
    "executorPeakMemory" : 1310720, // B
    "executorAverageMemory" : 818152, // B
    "executorNetAllocatedMemory" : 2000000000, // B
    "executorGrossAllocatedMemory" : 3000000000, // B
    "executorStatus" : -1 // 1 for ACTIVE, -1 for INACTIVE
  }
}

Questions

Now for the Kibana question:

Let's say I want to see the number of equivalent cores per application on a date histogram.
Ideally, the result would follow this procedure:

For every timestamp t_n (most likely one every 15 minutes):

  • Query every executorEntry where executorEntry.executorStartTime <= t_n and executorEntry.executorEndTime >= t_n
  • Get executorEntry.totalCores from them
  • Break down (aggregate) per sourceAppLabel
  • Show the date histogram as stacked bars.
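To make the intended semantics concrete, the procedure above can be sketched in plain Python, outside of Kibana. This is only an illustration of the sampling logic, not something Lens or Elasticsearch runs: the helper names `cores_per_app_at` and `cores_histogram` are hypothetical, and the sample document is trimmed from the message shown earlier.

```python
from collections import defaultdict


def cores_per_app_at(entries, t_n):
    """Sum totalCores per sourceAppLabel over executors alive at instant t_n (epoch ms)."""
    totals = defaultdict(int)
    for doc in entries:
        executor = doc["executorEntry"]
        # An executor counts at t_n if it started at or before t_n and ended at or after it.
        if executor["executorStartTime"] <= t_n <= executor["executorEndTime"]:
            totals[doc["sourceAppLabel"]] += executor["totalCores"]
    return dict(totals)


def cores_histogram(entries, start_ms, end_ms, step_ms=15 * 60 * 1000):
    """Evaluate cores_per_app_at at every sample instant t_n in [start_ms, end_ms)."""
    return {t: cores_per_app_at(entries, t) for t in range(start_ms, end_ms, step_ms)}


# Sample document, trimmed from the executor message shown earlier.
sample = [{
    "sourceAppLabel": "RandomTestApp#2022-04-26_11:55:34",
    "executorEntry": {
        "totalCores": 8,
        "executorStartTime": 1650966959724,
        "executorEndTime": 1650966972460,
    },
}]

alive = cores_per_app_at(sample, 1650966960000)   # instant inside the executor's lifetime
after = cores_per_app_at(sample, 1650966980000)   # instant after the executor died
print(alive, after)
```

Each value of the resulting dictionary corresponds to one stack of bars: one segment per sourceAppLabel, with its height equal to the summed cores.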

Could this be done with Lens? If so, how? Is there something like the Canvas expression editor that could be included in a dashboard?

I thought of Timelion, which could provide a language to do just that, but I heard it has been deprecated. What would replace it? I think Vega might be overkill, but perhaps I am just scared of the barrier to entry. I would love some pointers.

I know the question is quite open-ended, as I am pretty new to the technology, so I'll be happy to answer anything that might be of use. Feel free to recommend good practices too!

Given how your data is structured, I would try this in Lens from a dashboard (you can also import visualizations into Canvas with the "Add from Kibana" button in 7.17; the editors are much more similar in later versions).

Set the filter you specified, executorEntry.executorStartTime <= t_n and executorEntry.executorEndTime >= t_n, in the query bar.

Get executorEntry.totalCores from them

This sounds like a good use of the Sum quick function.

Break down (aggregate) per sourceAppLabel

Set this as the "Break down by" dimension. Adjust the number of terms and the rank order.
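For reference, the same three steps (filter, sum, break down) can be written as a raw Elasticsearch search body for one fixed instant t_n. The sketch below only builds the request as a Python dictionary. The function name `cores_at_instant_body` and the aggregation names `per_app` and `cores` are made up for illustration, and it assumes sourceAppLabel is mapped as a keyword, which the sample message does not state.

```python
import json


def cores_at_instant_body(t_n_ms, max_apps=10):
    """Build a search body: cores per application for executors alive at t_n_ms (epoch ms)."""
    return {
        "size": 0,  # only the aggregation result is needed, not the hits
        "query": {
            "bool": {
                "filter": [
                    {"term": {"dataClass": "ExecutorEntry"}},
                    # Executor started at or before t_n ...
                    {"range": {"executorEntry.executorStartTime": {
                        "lte": t_n_ms, "format": "epoch_millis"}}},
                    # ... and ended at or after t_n.
                    {"range": {"executorEntry.executorEndTime": {
                        "gte": t_n_ms, "format": "epoch_millis"}}},
                ]
            }
        },
        "aggs": {
            "per_app": {
                # Assumption: sourceAppLabel is a keyword field.
                "terms": {"field": "sourceAppLabel", "size": max_apps},
                "aggs": {"cores": {"sum": {"field": "executorEntry.totalCores"}}},
            }
        },
    }


body = cores_at_instant_body(1650966960000)
print(json.dumps(body, indent=2))
```

Note that t_n is a single constant here, so this yields one stacked bar rather than a whole histogram.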

Hello Graham! Thanks for the response!

How would I tell the date histogram that t_n is the current bucket's timestamp? (If I tell the date histogram to bucket the data with a 30-minute interval, it should apply the filter every 30 minutes to populate the stacked bar chart.)

For example, if:

  • A.executorStartTime = 10:25 and A.executorEndTime = 10:55
  • B.executorStartTime = 10:35 and B.executorEndTime = 11:15

Then on the graph:

  • Bucket 09:30 -> 10:00 : 0
  • Bucket 10:00 -> 10:30 : A
  • Bucket 10:30 -> 11:00 : A + B
  • Bucket 11:00 -> 11:30 : B
  • Bucket 11:30 -> 12:00 : 0
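The bucket assignment in this example can be checked with a small Python sketch. It counts an executor in every bucket its lifetime overlaps, which is the rule the example implies (it is slightly broader than testing a single instant t_n). The function name `overlapping_per_bucket` and the calendar date are arbitrary choices made for illustration.

```python
from datetime import datetime, timedelta


def overlapping_per_bucket(intervals, first_bucket, bucket_count, width):
    """Map each bucket start (HH:MM) to the sorted names of intervals overlapping it."""
    result = {}
    for i in range(bucket_count):
        lo = first_bucket + i * width
        hi = lo + width
        # [start, end] overlaps the bucket [lo, hi) if it starts before hi and ends after lo.
        result[lo.strftime("%H:%M")] = sorted(
            name for name, (start, end) in intervals.items()
            if start < hi and end > lo
        )
    return result


day = datetime(2022, 4, 26)  # arbitrary date, only the time of day matters here


def at(hour, minute):
    return day.replace(hour=hour, minute=minute)


intervals = {
    "A": (at(10, 25), at(10, 55)),
    "B": (at(10, 35), at(11, 15)),
}
buckets = overlapping_per_bucket(intervals, at(9, 30), 5, timedelta(minutes=30))
for start, names in buckets.items():
    print(start, names)
```

Replacing the list of names with a sum of totalCores per sourceAppLabel would give the values for the stacked bars.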

Ah, I get it now. I don't think you have access to the date histogram's current interval here, so Vega might be a good option.

If you don't need the exact interval of the visualization, you could write a runtime field that checks within a sort of sliding window, i.e. executorEntry.executorStartTime <= (15 minutes before timestamp) and executorEntry.executorEndTime >= (15 minutes after timestamp), but it's not a perfect solution for what you've asked here.

I looked into runtime fields, but I failed to see how they could be used to populate a stacked bar chart.

So I guess my only solution is Vega, then. Would you happen to have some pointers on what a Vega solution would look like? (Maybe some code examples with a similar workflow.)
It is not very beginner-friendly! Haha.

My team is not opposed to moving to Canvas. Could this be done inside the expression editor? (Again, if you have some pointers, I would gladly appreciate them.)

May I revive this topic?

Fitting objects with a start and end time into a date histogram seems like a pretty general problem. Surely there is a straightforward way to do this, right?

Looking for guidance!