How to define a time range in a custom query rule in Elastic SIEM?

Hi,

I'm trying to create a rule which should fire on admin activity when the time of the event is between 22:00 and 05:00.
I tried this query:

user.group.name :*admin* and ((@timestamp >= "00:00:00" and @timestamp <= "05:00:00" ) or (@timestamp >= "22:00:00" and @timestamp <= "00:00:00"))

But it fails with "failed to parse date field". How can I fix this?

Thank you.


You should join the Elastic Slack (https://ela.st/slack) and the following channels:

  • security-detection-rules
  • security-eql

Hi @siginigin!

Would you be willing to share a few more details about your use case? Specifically:

  • Are the endpoints sending events all generally located in the same time zone?

  • Is the intent of using a time range of 22:00 to 5:00 to detect admin activity relative to a local user's time (which may vary across time zones), or does that time period describe, for example, a window during which centrally-located admins are not expected to be performing administrative work?

  • Is endpoint data collected via Elastic Agent, or Beats?

  • Are your events populated with the event.ingested field?

I discussed your post with the team, and the general consensus is that adding a custom boolean off_hours field at ingest time, based on the value of the event.ingested field, would be ideal. The query in the detection rule would then, for example, look something like:

user.group.name :*admin* and off_hours: true

Determining whether an event is "off hours" depends in part on whether "off hours" is timezone-specific or universal. Your answers to the questions above will help define the solution.
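
For illustration only, here is a rough sketch of how such a field might be set with an ingest pipeline script processor. The pipeline name off_hours_enrichment, the 22:00-05:59 UTC window, and the assumption that event.ingested is present as an ISO 8601 string are just for the example; where this logic would actually live depends on your answers to the questions above.

PUT _ingest/pipeline/off_hours_enrichment
{
  "description": "Sketch: flag events ingested during off hours (22:00-05:59 UTC) as off_hours",
  "processors": [
    {
      "script": {
        "source": """
        // Assumes event.ingested is an ISO 8601 string, e.g. "2021-03-01T23:15:00.000Z"
        ZonedDateTime ingested = ZonedDateTime.parse(ctx.event.ingested);
        int hour = ingested.getHour();
        ctx.off_hours = (hour >= 22 || hour <= 5);
        """
      }
    }
  ]
}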

Hi Andrew,

regarding your questions:

  • all log sources are now in the same timezone, but after COVID it is very likely that some users will travel around the world.
  • the 22:00-05:00 range is not exact; I just wanted to pick "some" time window to see if there will be false positives, and maybe I will adjust it later. All the admins are in the same city and timezone now - they are lazy and don't travel. I use the event arrival time as the timestamp, and there is no buffering on the path now (but there will be soon, and then I'll use the original event time from the source, with respect to its timezone).
  • we use Auditbeat, Filebeat, Winlogbeat and rsyslog as log shippers.
  • not now, but I will populate it.

To clarify the setup: all the log shippers push events to a central Logstash, and after processing they are stored in Elasticsearch. I don't really use Elastic ingest pipelines now (although the Beats data flows through them later), because the parsing, normalization and enrichment are done in Logstash. One of its rules is enrichment with domain groups, if user.name is defined.

Regarding the proposed solution - I'm sure it will work. But I'm not sure if it is OK to enrich events with a user-defined timezone flag - I think this is more rule-engine stuff and should be kept separate, i.e. not stored in the event. Moreover, the events are in a data stream, which can't be changed, only reprocessed into a new index if I change the range, so I'd rather define it in the rule.
Can an EQL range with gte and lte be used to achieve this?

Please share your opinion on this.

Thank you.

Hi @siginigin,

I shared your reply with the team, and they offered the following:

As an alternative to creating an off_hours field by enriching the data at ingest time (via an ingest pipeline, or in your case, Logstash), you may consider experimenting with defining off_hours as a Runtime field.

The following description of runtime fields is from the Runtime fields: Schema on read for Elastic blog post:

Runtime fields enable you to create and query fields that are evaluated only at query time. Instead of indexing all the fields in your data as it’s ingested, you can pick and choose which fields are indexed and which ones are calculated only at runtime as you execute your queries. Runtime fields support new use cases and you won’t have to reindex any of your data.

Exploring a solution based on runtime fields

Note: Runtime fields are beta functionality and subject to change. The details below were shared by a colleague who briefly experimented with runtime fields based on your use case. The information below does not describe a complete solution.

The following POST creates a runtime field named hour_of_day, which extracts just the hour portion of the @timestamp field:

POST test-index/_mapping
{
  "runtime": {
    "hour_of_day": {
      "type": "long",
      "script": {
        "source": """
        emit(doc['@timestamp'].value.hourOfDay)
        """
      }
    }
  }
}

The goal of creating a runtime field like hour_of_day in the example above would be to refer to hour_of_day in a detection rule's KQL query, e.g. hour_of_day >= 22 OR hour_of_day <= 5
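
For example, combined with the admin filter from earlier in this thread, the rule's query might look something like this (a sketch; it assumes the hour_of_day runtime field is visible to the index pattern the rule queries):

user.group.name: *admin* and (hour_of_day >= 22 or hour_of_day <= 5)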

However, there is a caveat to the script above. When a date field is accessed via the doc collection, as in:

emit(doc['@timestamp'].value.hourOfDay)

the timezone information is lost, and hourOfDay is always reported in UTC. This happens because doc uses the indexed field value, which under the hood is really just a number representing milliseconds since the epoch.

My colleague found that when fields are accessed via params._source instead of doc, per the example below, it's possible to get the original string value from the event, and extract the timezone information from it:

"hour_of_day": {
      "type": "long",
      "script": {
        "source": """
        ZonedDateTime zdt = ZonedDateTime.parse(params._source['mytimestamp']);
        emit(zdt.getHour());
        """
      }
    },

The example above doesn't implement error handling for the case where a timestamp can't be parsed. Without error handling, a single document containing a non-parsable timestamp could cause the entire query to fail.

Thus, neither of the examples above is a complete implementation of hour_of_day as a runtime field, but they can serve as a starting point for experimentation.
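
For experimentation, basic error handling could be added along these lines (a sketch only, reusing the hypothetical mytimestamp field from the example above; a document with a missing or unparsable timestamp would simply emit no value):

"hour_of_day": {
  "type": "long",
  "script": {
    "source": """
    def raw = params._source['mytimestamp'];
    if (raw != null) {
      ZonedDateTime zdt = null;
      try {
        // Parse the original string, keeping its timezone offset
        zdt = ZonedDateTime.parse(raw.toString());
      } catch (Exception e) {
        // Skip documents whose timestamp can't be parsed instead of failing the query
        zdt = null;
      }
      if (zdt != null) {
        emit(zdt.getHour());
      }
    }
    """
  }
}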

Another colleague noted that even if hour_of_day is implemented as a runtime field that reliably works across time zones so it can be referred to in KQL, the definition of "normal working hours" can still vary widely, even for users in the same time zone. It may be possible to pivot to a machine-learning-driven approach to this problem, where ML determines what "off hours" means based on the behavior of an individual user over time. The folks here are best-positioned to answer questions about an ML-driven approach, if that's an alternative you're willing to explore at this time.

Update: Runtime fields (schema on read) are now GA in the 7.12 release
