Hi.
We're storing security data in ES. We are not using Beats, but we format our data according to ECS. Unfortunately, the allowed values for many fields are not well standardized yet, and working out how to get everything to play nicely together is taking us quite a lot of time.
One of the numerous issues we're working on is getting our authentications recognized by the SIEM app.
This is the request that is used by the SIEM app:
{
  "aggs": {
    "authentication_success": {
      "filter": {
        "term": {
          "event.type": "authentication_success"
        }
      }
    },
    "authentication_success_histogram": {
      "auto_date_histogram": {
        "field": "@timestamp",
        "buckets": "6"
      },
      "aggs": {
        "count": {
          "filter": {
            "term": {
              "event.type": "authentication_success"
            }
          }
        }
      }
    },
    "authentication_failure": {
      "filter": {
        "term": {
          "event.type": "authentication_failure"
        }
      }
    },
    "authentication_failure_histogram": {
      "auto_date_histogram": {
        "field": "@timestamp",
        "buckets": "6"
      },
      "aggs": {
        "count": {
          "filter": {
            "term": {
              "event.type": "authentication_failure"
            }
          }
        }
      }
    }
  },
  "query": {
    "bool": {
      "filter": [
        {
          "bool": {
            "filter": [
              {
                "term": {
                  "event.category": "authentication"
                }
              }
            ]
          }
        },
        {
          "range": {
            "@timestamp": {
              "gte": 1575154800000,
              "lte": 1575630751734
            }
          }
        }
      ]
    }
  },
  "size": 0,
  "track_total_hits": false
}
As we can see, the app selects its data based on "event.category": "authentication", which is expected. However, the aggregations are based on the value of event.type. The ECS documentation on event.type states the following (https://www.elastic.co/guide/en/ecs/current/ecs-event.html):
Reserved for future usage.
Please avoid using this field for user data.
As such, I feel like filtering on "event.category": "authentication" and then aggregating on event.outcome would make a lot more sense.
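To make the suggestion concrete, here is a rough sketch of what the counting part of the request could look like if it aggregated on event.outcome instead. The values "success" and "failure" are my own assumption, since the allowed values for event.outcome would first have to be published in ECS. The histograms are left out for brevity; they would change in the same way.

```json
{
  "aggs": {
    "authentication_success": {
      "filter": {
        "term": {
          "event.outcome": "success"
        }
      }
    },
    "authentication_failure": {
      "filter": {
        "term": {
          "event.outcome": "failure"
        }
      }
    }
  },
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "event.category": "authentication"
          }
        },
        {
          "range": {
            "@timestamp": {
              "gte": 1575154800000,
              "lte": 1575630751734
            }
          }
        }
      ]
    }
  },
  "size": 0,
  "track_total_hits": false
}
```

With something like this, any indexer that sets event.category and event.outcome correctly would show up in the authentication widgets, without touching event.type.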
Another great example is the "rare process" query. Instead of filtering on event.action: "process_started" and having all indexers standardize on this, we build up this huge query that handles all the possible specific cases:
- "event.action": "executed" (auditbeat/auditd)
- "event.action": "process_started" (auditbeat/system)
- "event.code": "4688" (winlog)
- "winlog.event_id": 1 (sysmon)
- "event.type": "process_start" (generic, but using a reserved and undocumented field).
There is simply no ECS-compliant way I can have my data appear in the dashboard if it was not generated by one of the official beats. I either have to lie about where my data comes from, or populate a field I'm explicitly told not to use.
Basically, I feel like the right way to go about designing most of those requests would be to standardize the values for some of the fields (event.category, event.action, event.outcome, ...), and then filter only based on that. The beats (or ingest pipelines, or any other system indexing the data) would then be responsible for ensuring the data gets formatted properly.
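As an illustration, if "process_started" were the agreed value for event.action, the process-start condition of the "rare process" query could shrink to a single term filter. This is only a sketch of that one condition: the chosen value is an assumption on my part, and the rest of the query (time range, aggregations) is omitted.

```json
{
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "event.action": "process_started"
          }
        }
      ]
    }
  }
}
```

The per-source special cases would then live in the beats or pipelines that write the data, not in the frontend query.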
I'm fine with standardizing on event.type, but the acceptable uses and values of the field really need to be published.
By extension, filtering on stuff like event.agent, event.module or event.dataset seems like bad form to me. The frontend should not have to care where the data comes from, only that it complies with ECS.