Threshold rules not triggering on self-made index

Well, the good news is that we just logged an issue: we are going to stop allowing thresholds on "non-aggregatable" fields so that we can give a better UI/UX experience.

So, thank you for the forum posts and looking at things.

In the meantime, before that fix ships, I am going to explain a bit about keyword/text fields, aggregatable fields, and mapping conflicts for you and anyone else currently running into this, so you know how and why it behaves this way.

If you look at how the threshold rule works, it uses an aggregation with a minimum document count, the same shape as the terms aggregations with min_doc_count in the examples below.

So when you run queries in Dev Tools, be careful about aggregations versus regular queries. Aggregations are the picky ones: they need the field to be of a particular type such as "keyword", or as it is referred to, an "aggregatable" type.

Examples below:

Here is one against my ldap index, which will blow up in Dev Tools because user.name is a text field:

GET ldap-delme/_search
{
  "size": 0,
  "aggregations": {
    "threshold": {
      "terms": {
        "field": "user.name",
        "min_doc_count": 1
      }
    }
  }
}

Errors you will get back from ES:

"Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [user.name] in order to load field data by uninverting the inverted index. Note that this can use significant memory."

Auditbeat has the mapping the other way around: user.name is a keyword first, with text as a multi-field underneath it. The same query works there because keyword is aggregatable:

GET auditbeat-7.9.1/_search
{
  "size": 0,
  "aggregations": {
    "threshold": {
      "terms": {
        "field": "user.name",
        "min_doc_count": 1
      }
    }
  }
}

If I change my first ldap index query to use user.name.keyword, it will also work, because I am now using the keyword field explicitly:

GET ldap-delme/_search
{
  "size": 0,
  "aggregations": {
    "threshold": {
      "terms": {
        "field": "user.name.keyword",
        "min_doc_count": 1
      }
    }
  }
}
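Whether a .keyword sub-field exists depends on how the index was mapped (by default, dynamic mapping indexes strings as text with a keyword sub-field). A quick way to check, as a sketch that reuses the ldap-delme index from above, is the get field mapping API:

```
# Show how user.name is mapped in ldap-delme, including any multi-fields such as "keyword"
GET ldap-delme/_mapping/field/user.name
```

If the response shows type text with a keyword entry under "fields", then user.name.keyword is available for aggregations.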

In the ECS docs, however, user.name is expected to be a keyword first, with anything additional as a .text multi-field underneath it.


That is why I recommend re-indexing and following ECS: it makes the out-of-the-box rules work and makes it easier to collaborate with other people on rules and content. Also, if you mix your ldap index mapping with an auditbeat mapping or other ECS mappings, you will start to see mapping conflicts and other bad behaviors, because you cannot mix indices that map the same field to different data types. This can lead to more aggregation woes.
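As a rough sketch of what that re-index could look like: the destination index name ldap-delme-ecs is made up for illustration, and only the two fields discussed in this thread are mapped, so a real mapping would need the rest of your fields (or the full ECS template) as well.

```
# Hypothetical destination index with ECS-style mappings for the two fields from this thread
PUT ldap-delme-ecs
{
  "mappings": {
    "properties": {
      "user": {
        "properties": {
          "name": {
            "type": "keyword",
            "fields": {
              "text": { "type": "text" }
            }
          }
        }
      },
      "event": {
        "properties": {
          "outcome": { "type": "keyword" }
        }
      }
    }
  }
}

# Copy the existing documents into the new index
POST _reindex
{
  "source": { "index": "ldap-delme" },
  "dest": { "index": "ldap-delme-ecs" }
}
```

After that, a terms aggregation on user.name against the new index should behave like the auditbeat example, without needing the .keyword suffix.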

Now, why are you hitting an error with ldap_delme2? Well, if we look at event.outcome, it is of type text, which is not aggregatable.
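If you prefer to check this from Dev Tools, the field capabilities API reports it per field. A minimal sketch against the ldap_delme2 index named above:

```
# Ask which types event.outcome has in this index and whether they are aggregatable
GET ldap_delme2/_field_caps?fields=event.outcome
```

Each type listed under the field in the response carries "searchable" and "aggregatable" flags; for a plain text field, "aggregatable" comes back false.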

Using Kibana, you can create an index pattern for ldap_delme2 and it will show you which fields are and are not aggregatable.

The first step is to create the index pattern under Stack Management -> Index Patterns.

Afterwards you can see what is and is not aggregatable, and event.outcome is not.

What if I tried to mix auditbeat together with my ldap_delme, which has your current mapping? Well, the Kibana index pattern would tell us we now have conflicts around user.name, since one index maps it as text and the other as keyword.

Here I mix a valid ECS index with yours, which has a few text fields where ECS expects keyword.

Afterwards, filtering by data type "conflict" shows that I have conflicts on three fields.

And I can hover over each one to see what is going on.
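The same conflict can be seen outside the index pattern UI by asking for field capabilities across both indices at once. A sketch, assuming the auditbeat-7.9.1 and ldap-delme index names used in the queries earlier:

```
# Compare how user.name is typed across the two indices
GET auditbeat-7.9.1,ldap-delme/_field_caps?fields=user.name
```

When the indices disagree, the response lists more than one type under user.name (here keyword and text), each with an "indices" array naming the indices that use it.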
