Exact match on log message field

Hi, I am evaluating the Alerts feature in the latest Kibana version. We are collecting logs from our server and I want to set up exact-match alerting rules. When creating an alert and setting up a condition on the message field, the options I have are match or match phrase (and their negations). However, through some testing, it seems that the matching is done pretty loosely, as many unrelated entries are matched.

The way I have worked around this is to match against a single unique word (an error code), but this won't be applicable to all the alerting rules we will want to set up.

How can I do an exact-match against a string field in the alert? There is nothing custom about the log indices, just whatever came preconfigured with filebeat.

Thanks, Martin

Hi Martin :wave:

Are you asking in general or specifically for the Log Threshold rule type?

If you're asking in general, then I'd expect the ES Query rule type to achieve just that when using a standard term query.
Perhaps if you post an example of the query and what data you don't expect it to match, we can help you figure out why it isn't working as expected.
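
For reference, a term query body for the ES Query rule type could look roughly like the sketch below. It assumes the target field is mapped as `keyword` (a term query against an analyzed `text` field generally won't behave as an exact match on the whole string). The field name `message_exact`, the sample value, and the helper `build_term_query` are hypothetical and not part of the default Filebeat setup.

```python
import json


def build_term_query(field, value):
    """Return an Elasticsearch Query DSL body containing a single term query."""
    return {"query": {"term": {field: {"value": value}}}}


# Hypothetical field name and value, for illustration only.
body = build_term_query("message_exact", "S3150 Message blocked")
print(json.dumps(body, indent=2))
```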

If on the other hand, you're referring to the Log Threshold rule type, then it would help if you could show us the exact configuration you've used as I think it should work the way you've asked for it to work... but I might be wrong. :thinking:

Thanks

Sorry, wasn't too specific in my question. Indeed it's the second case - the Log Threshold rule type.

The configuration I am using, as given by the API, is:

{
  "id": "050370c0-7932-11ec-9178-69c31341d44d",
  "consumer": "logs",
  "tags": [],
  "name": "MS Email blocked",
  "enabled": true,
  "throttle": null,
  "schedule": {
    "interval": "10m"
  },
  "params": {
    "timeSize": 10,
    "timeUnit": "m",
    "count": {
      "value": 1,
      "comparator": "more than or equals"
    },
    "criteria": [
      {
        "comparator": "matches",
        "field": "message",
        "value": "S3150"
      }
    ],
    "groupBy": [
      "host.name"
    ]
  },
  "rule_type_id": "logs.alert.document.count",
  "created_by": "elastic",
  "updated_by": "elastic",
  "created_at": "2022-01-19T14:13:48.401Z",
  "updated_at": "2022-01-24T12:38:11.422Z",
  "api_key_owner": "elastic",
  "notify_when": "onActiveAlert",
  "mute_all": false,
  "muted_alert_ids": [],
  "scheduled_task_id": "4cbe8d20-79e9-11ec-9178-69c31341d44d",
  "execution_status": {
    "status": "ok",
    "last_execution_date": "2022-01-25T07:53:59.784Z",
    "last_duration": 345
  },
  "actions": [
    {
      "group": "logs.threshold.fired",
      "id": "eedd6d50-7931-11ec-9178-69c31341d44d",
      "params": {
        "documents": [
          {
            "matching_documents": "{{context.matchingDocuments}}",
            "rule_name": "{{rule.name}}",
            "rule_id": "{{rule.id}}",
            "conditions": "{{context.conditions}}",
            "@timestamp": "{{context.timestamp}}",
            "alert_id": "{{alert.id}}"
          }
        ]
      },
      "connector_type_id": ".index"
    }
  ]
}

In the screenshots below you can see that when I change the matching text from the single word to a longer, more specific string, I actually get more results. The longer string contains the original word plus some other words, so it follows that the query should match either the same number of messages or fewer, never more (if it were an exact match).


(I manually added a matching string to the logs of one of my servers to verify the matching works OK - so there is exactly one result).

Let me know if there is more information you'd need! Thanks

All good! That's why I asked :slight_smile:

I've pinged the team that owns the Log Threshold rule type, and asked them to weigh in here.
Once someone on that team is online, I'm sure they'll jump in here (Elastic is distributed across multiple time zones, so response times can vary a lot).

Hi @melkamar,

As with most Elasticsearch queries, the exact behavior depends on the field type. In this case I assume the message field is of type text, which means its content is analyzed. The "MATCHES" query operator in your rule is directly translated (see log_threshold_executor.ts#L699-L705) into a match query with all its fuzziness, leniency and expansion. The default operator that match uses to combine the terms is OR, so adding additional words can indeed lead to more documents being found. You can influence the operator by adding quotes around the sequence of tokens. Alternatively, the "MATCHES PHRASE" operator is translated into a match_phrase, which might suit your use case better.
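
To illustrate why a longer query can return more hits, here is a toy Python model of the two behaviors. It is only a sketch: the `analyze` function is a crude stand-in for the real analyzer (lowercase, split on non-alphanumerics), the log lines are made up, and none of this is the actual Elasticsearch or Kibana implementation.

```python
import re


def analyze(text):
    """Crude stand-in for an analyzer: lowercase, split on non-alphanumerics."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]


def match_or(query, document):
    """Toy `match` with the default OR operator: any query term hits."""
    doc_terms = set(analyze(document))
    return any(term in doc_terms for term in analyze(query))


def match_phrase(query, document):
    """Toy `match_phrase`: all query terms adjacent and in order."""
    q, d = analyze(query), analyze(document)
    if not q:
        return False
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))


# Invented sample messages, for illustration only.
logs = [
    "S3150 Message blocked by policy",
    "Message delivered to mailbox",
    "Connection blocked by firewall",
]


def count(matcher, query):
    """Number of sample messages the given matcher accepts for the query."""
    return sum(matcher(query, doc) for doc in logs)


print(count(match_or, "S3150"))                      # 1
print(count(match_or, "S3150 Message blocked"))      # 3, each extra term widens the OR
print(count(match_phrase, "S3150 Message blocked"))  # 1
```

In this model, every extra word in an OR-combined query can only add matches, while the phrase variant requires the whole token sequence, which is closer to what "exact match" usually means for an analyzed text field.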

Let us know if that helps.

Thanks for the quick reply.

Indeed the message field is of type text, so your explanation is spot on. I am not very well versed in Elastic, and for some reason I thought that "match phrase" would be fuzzier than plain "match". Early in my testing "match phrase" was behaving weirdly, so I did not retry that option.

But now it behaves exactly as I need it to, so that solves my question. Cheers!
