Authentication fields used by SIEM vs ECS

Hi.
We're storing security data in ES. We are not using Beats, but we are formatting our data according to ECS. Unfortunately, the allowed values for many fields are not yet well standardized, and working out how to get everything to play nicely together is taking us quite a lot of time.

One of the numerous issues we're working on is getting our authentications recognized by the SIEM app.

This is the request that is used by the SIEM app:

	{
	  "aggs": {
	    "authentication_success": {
	      "filter": {
	        "term": {
	          "event.type": "authentication_success"
	        }
	      }
	    },
	    "authentication_success_histogram": {
	      "auto_date_histogram": {
	        "field": "@timestamp",
	        "buckets": "6"
	      },
	      "aggs": {
	        "count": {
	          "filter": {
	            "term": {
	              "event.type": "authentication_success"
	            }
	          }
	        }
	      }
	    },
	    "authentication_failure": {
	      "filter": {
	        "term": {
	          "event.type": "authentication_failure"
	        }
	      }
	    },
	    "authentication_failure_histogram": {
	      "auto_date_histogram": {
	        "field": "@timestamp",
	        "buckets": "6"
	      },
	      "aggs": {
	        "count": {
	          "filter": {
	            "term": {
	              "event.type": "authentication_failure"
	            }
	          }
	        }
	      }
	    }
	  },
	  "query": {
	    "bool": {
	      "filter": [
	        {
	          "bool": {
	            "filter": [
	              {
	                "term": {
	                  "event.category": "authentication"
	                }
	              }
	            ]
	          }
	        },
	        {
	          "range": {
	            "@timestamp": {
	              "gte": 1575154800000,
	              "lte": 1575630751734
	            }
	          }
	        }
	      ]
	    }
	  },
	  "size": 0,
	  "track_total_hits": false
	}

As we can see, the app selects its data based on

 "event.category": "authentication"

This is expected. However, the aggregations are based on the value of event.type, and the ECS documentation on event.type (https://www.elastic.co/guide/en/ecs/current/ecs-event.html) states the following:

Reserved for future usage.
Please avoid using this field for user data.

As such, I feel like using "event.category": "authentication" and then aggregating on event.outcome would make a lot more sense.
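
To make the suggestion concrete, here is a rough sketch of what the same request could look like if it aggregated on event.outcome instead of event.type. It is written as a small Python helper only so the body can be generated and checked. The function name build_auth_outcome_query is made up for illustration, and the outcome values "success" and "failure" are an assumption on my part, since the official values have not been published in this thread.

```python
import json


def build_auth_outcome_query(gte_ms: int, lte_ms: int, buckets: int = 6) -> dict:
    """Build a search body that counts authentications by event.outcome.

    Hypothetical helper: the outcome values "success" and "failure" are
    assumed, not taken from a published ECS value list.
    """

    def outcome_filter(outcome: str) -> dict:
        # Filter aggregation matching a single assumed outcome value.
        return {"filter": {"term": {"event.outcome": outcome}}}

    aggs = {}
    for outcome in ("success", "failure"):
        # Total count for this outcome over the whole time range.
        aggs[f"authentication_{outcome}"] = outcome_filter(outcome)
        # Same count, split into time buckets for the histogram.
        aggs[f"authentication_{outcome}_histogram"] = {
            "auto_date_histogram": {"field": "@timestamp", "buckets": buckets},
            "aggs": {"count": outcome_filter(outcome)},
        }

    return {
        "aggs": aggs,
        "query": {
            "bool": {
                "filter": [
                    # Document selection stays on event.category, as today.
                    {"term": {"event.category": "authentication"}},
                    {"range": {"@timestamp": {"gte": gte_ms, "lte": lte_ms}}},
                ]
            }
        },
        "size": 0,
        "track_total_hits": False,
    }


if __name__ == "__main__":
    body = build_auth_outcome_query(1575154800000, 1575630751734)
    print(json.dumps(body, indent=2))
```

The only change from the original request is the field and values used inside the filter aggregations. The document selection on event.category is untouched.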

Another great example is the "rare process" query. Instead of filtering on event.action: "process_started" and having all indexers standardize on that, the app builds up this huge query that handles all the possible specific cases:

  • "event.action": "executed" (auditbeat/auditd)
  • "event.action": "process_started" (auditbeat/system)
  • "event.code": "4688" (winlog)
  • "winlog.event_id": 1 (sysmon)
  • "event.type": "process_start" (generic, but using a reserved and undocumented field).

There is simply no ECS-compliant way to make my data appear in the dashboard if it was not generated by one of the official beats. I either have to lie about where my data comes from, or populate a field I'm explicitly told not to use.

Basically, I feel like the right way to go about designing most of those requests would be to standardize the values for some of the fields (event.category, event.action, event.outcome, ...), and then filter only on those. The beats (or ingest pipelines, or any other system indexing the data) would then be responsible for ensuring the data gets formatted properly.
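
As a rough illustration of that split of responsibilities, here is a sketch of the normalization an indexer could apply before indexing, so that the frontend only needs a single term filter. Everything here is hypothetical: the function names, the choice of "process_started" as the standardized value, and the representation of events as flat dicts with dotted keys. The source-specific conditions are the ones listed above; a real pipeline would probably also check which source each event came from before trusting them.

```python
# Assumed standardized value; no official value list exists yet.
PROCESS_START_ACTION = "process_started"


def is_process_start(event: dict) -> bool:
    """Return True if a flat, dotted-key event matches a known process-start case."""
    return (
        # auditbeat/auditd and auditbeat/system
        event.get("event.action") in ("executed", "process_started")
        # winlog (the code may arrive as a string or a number)
        or str(event.get("event.code")) == "4688"
        # sysmon
        or event.get("winlog.event_id") == 1
        # generic, using the reserved field
        or event.get("event.type") == "process_start"
    )


def normalize(event: dict) -> dict:
    """Return a copy of the event with event.action standardized when applicable."""
    normalized = dict(event)  # leave the caller's event untouched
    if is_process_start(event):
        normalized["event.action"] = PROCESS_START_ACTION
    return normalized


# With normalized data, the frontend filter collapses to a single term query.
RARE_PROCESS_FILTER = {"term": {"event.action": PROCESS_START_ACTION}}


if __name__ == "__main__":
    print(normalize({"event.code": "4688", "process.name": "cmd.exe"}))
```

The point of the sketch is only that the source-specific knowledge lives on the indexing side, and the query side does not need to know about it.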

I'm fine with standardizing on event.type, but the acceptable uses and values of the field really need to be published.

By extension, filtering on stuff like event.agent, event.module or event.dataset seems like bad form to me. The frontend should not have to care where the data comes from, only that it complies with ECS.

Hi @vbr,

Yes, both the ECS team and the SIEM team are aware of this imperfect state of affairs.

The plan is indeed to publish the official values for the reserved fields shortly, and to have SIEM look for those, rather than going by source-specific fields.

For completeness' sake, the 4 reserved fields are event.kind, event.category, event.type and event.outcome.

Once the values for these fields are published, the plan is indeed to:

  • enable users, the SIEM app and third-party partners to rely on these fields to query for "auth events", "process activity", "DNS traffic" & whatnot, without having to rely on which source they came from.
  • enable users like you to populate the SIEM via custom pipelines without having to "lie" about the source of the events :)

We're trying to deliver these values soon, so stay tuned!

Thanks for your answer.

Is there a draft / work-in-progress version that we could base our current work on?

The first few "official" values and a draft of additional values under consideration are coming soon. I can't say more than that for now.

Counted in weeks, not months, let's say.