I shared your reply with the team, and they offered the following:
As an alternative to creating an off_hours field by enriching the data at ingest time (via an ingest pipeline or, in your case, Logstash), you may consider experimenting with defining off_hours as a runtime field.
The following description of runtime fields is from the Runtime fields: Schema on read for Elastic blog post:
Runtime fields enable you to create and query fields that are evaluated only at query time. Instead of indexing all the fields in your data as it’s ingested, you can pick and choose which fields are indexed and which ones are calculated only at runtime as you execute your queries. Runtime fields support new use cases and you won’t have to reindex any of your data.
Exploring a solution based on runtime fields
Note: Runtime fields are beta functionality and subject to change. The details below were shared by a colleague who briefly experimented with runtime fields based on your use case. The information below does not describe a complete solution.
The POST request creates a runtime field named hour_of_day, which extracts just the hour portion from a date field.
The goal of creating a runtime field like hour_of_day in the example above would be to refer to hour_of_day in a detection rule's KQL query, e.g.
hour_of_day >= 22 OR hour_of_day <= 5
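To make the window in that query concrete, here is a minimal sketch in plain Java (not KQL or Painless). The class name OffHoursWindow and the constants are hypothetical, with the bounds simply copied from the query above. It only illustrates why the two comparisons are joined with OR: the window wraps past midnight.

```java
final class OffHoursWindow {
    // Hypothetical bounds copied from the KQL example: 22:00 through 05:59.
    static final int START_HOUR = 22;
    static final int END_HOUR = 5;

    /** Returns true when hourOfDay (0-23) falls inside the overnight window. */
    static boolean isOffHours(int hourOfDay) {
        // The window wraps past midnight, so the comparisons are joined with OR.
        // With AND, no hour could ever satisfy both conditions.
        return hourOfDay >= START_HOUR || hourOfDay <= END_HOUR;
    }

    public static void main(String[] args) {
        for (int hour : new int[] {0, 5, 6, 21, 22, 23}) {
            System.out.println(hour + " -> " + isOffHours(hour));
        }
    }
}
```

Note that an upper bound of `<= 5` includes everything up to 05:59, since only the hour is compared.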
However, a caveat to the script above is that when a date field is accessed via doc, the timezone information is lost and hourOfDay is always reported in UTC. This happens because doc uses the indexed field, which under the hood is really just a number representing milliseconds since the epoch.
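The effect can be reproduced outside Elasticsearch. The sketch below is plain Java rather than Painless, and it is only an illustration of the mechanism, not the script my colleague ran. The class name, method names, and the sample timestamp are made up for the example. It shows that once a timestamp has been reduced to milliseconds since the epoch, the original offset is gone and the hour can only be read back in a zone you choose (UTC here).

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

final class EpochMillisLosesOffset {
    /** Hour as written in the original timestamp string, offset preserved. */
    static int localHour(String isoTimestamp) {
        return ZonedDateTime.parse(isoTimestamp).getHour();
    }

    /** Hour recovered after a round trip through epoch milliseconds, read as UTC. */
    static int hourFromEpochMillis(String isoTimestamp) {
        // Epoch milliseconds identify the instant but carry no offset or zone.
        long epochMillis = ZonedDateTime.parse(isoTimestamp).toInstant().toEpochMilli();
        return Instant.ofEpochMilli(epochMillis).atZone(ZoneOffset.UTC).getHour();
    }

    public static void main(String[] args) {
        String ts = "2021-03-01T23:30:00-05:00"; // made-up sample value
        System.out.println(localHour(ts));           // prints 23
        System.out.println(hourFromEpochMillis(ts)); // prints 4
    }
}
```

For a user at UTC-5, 23:30 local time is clearly "off hours", but the UTC hour of 4 would be compared against the KQL bounds instead.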
My colleague found that when fields are accessed via params._source instead of doc, per the example below, it's possible to get the original string value from the event and extract the timezone information from it:
ZonedDateTime zdt = ZonedDateTime.parse(params._source['mytimestamp']);
The example above doesn't implement error handling for the case where timestamps can't be parsed. Without error handling, a single document containing a non-parsable timestamp could cause the entire query to fail.
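As a rough idea of what that error handling could look like, here is a sketch in plain Java. It is not a tested Painless script: the class and method names are hypothetical, and how "no value" should be expressed inside a runtime field script is an assumption you would need to verify against the Painless documentation for your version. The point is only that a bad or missing timestamp yields an empty result instead of an exception.

```java
import java.time.ZonedDateTime;
import java.time.format.DateTimeParseException;
import java.util.OptionalInt;

final class SafeHourOfDay {
    /**
     * Returns the hour (0-23) as written in the timestamp string,
     * or an empty result when the value is missing or cannot be parsed.
     */
    static OptionalInt hourOfDay(String rawTimestamp) {
        if (rawTimestamp == null) {
            return OptionalInt.empty(); // field absent from the event
        }
        try {
            return OptionalInt.of(ZonedDateTime.parse(rawTimestamp).getHour());
        } catch (DateTimeParseException e) {
            // Swallow the parse failure so one bad document does not abort everything.
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(hourOfDay("2021-03-01T23:30:00-05:00")); // made-up sample value
        System.out.println(hourOfDay("not a timestamp"));
        System.out.println(hourOfDay(null));
    }
}
```

Note that ZonedDateTime.parse with its default formatter also rejects strings that have no offset or zone, so those would be treated as unparsable here as well.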
Thus, neither of the examples above is a complete implementation of hour_of_day as a runtime field, but they can serve as a starting point for experimentation.
Another colleague noted that even if hour_of_day is implemented as a runtime field that works reliably across time zones, so that it can be referred to in KQL, the definition of "normal working hours" can still vary widely, even for users in the same time zone. It may be possible to pivot to a machine-learning-driven approach to this problem, where ML determines what "off hours" means based on the behavior of an individual user over time. The folks here are best positioned to answer questions about an ML-driven approach, if that's an alternative you're willing to explore at this time.