I created shared exception lists for known, legitimate bots that appear in user agents and associated them with some of my detection rules, but when I check the results I still see those bots.
I found the documentation, but it does not mention whether shared exception lists are case-sensitive.
If they are case sensitive, what would be the best approach to include both lowercase and uppercase?
Example: BingBot, Bingbot, bingbot
Per the documentation, rule exceptions are case-sensitive:
Rule exceptions are case-sensitive, which means that any character that’s entered as an uppercase or lowercase letter will be treated as such.
One of the workarounds is also mentioned there:
In the event you don’t want a field evaluated as case-sensitive, some ECS fields have a .caseless version that you can use.
There can be other workarounds, and the best one would depend on your specific use case, needs, and available resources:
Is the list of known and legit bots used in user agents finite and short? Perhaps you could add them all to an exception. If it's just a few items, you could combine them with OR. For more items, you could create a value list and then use the "is in list" operator in your exception to match against all of them.
If EQL works well for your use cases, you could create EQL rules and exclude the bots directly in the EQL queries. EQL's : operator is case-insensitive.
If Custom query rules would work for your use cases, then you could create a Kibana saved query and then attach it to your rules as a filter. More details in the docs. Rule filters are flexible and powerful, as you can use Elasticsearch DSL in them.
If ES|QL query rules would work for your use cases, then you could probably use the TO_LOWER function.
You could leverage runtime fields to normalize the source field values to lowercase, and then match against the runtime field from your exceptions.
You could make the field case-insensitive at the mappings level, especially if it's your custom index and you control the mappings.
Similarly to the above, you could leverage ingest pipelines and the lowercase processor.
You could lowercase at ingest time / on the data source side.
I'm sure this is not an exhaustive list. Let us know if this helps!
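To make the runtime field and ingest pipeline options a bit more concrete, here is a minimal sketch that only builds the request bodies you would send to Elasticsearch. The field names (user_agent.original as the source, user_agent.original_lower as the normalized copy) are assumptions for illustration, not something taken from your environment, and the runtime field script assumes the source field is mapped as keyword.

```python
import json

# Assumed names: adjust to your own data. Neither comes from the thread.
SOURCE_FIELD = "user_agent.original"       # assumed field holding the raw user agent
LOWER_FIELD = "user_agent.original_lower"  # hypothetical field for the lowercased copy


def lowercase_pipeline(source: str, target: str) -> dict:
    """Body for PUT _ingest/pipeline/<name>, using the lowercase processor."""
    return {
        "description": "Store a lowercased copy of the user agent",
        "processors": [
            {
                "lowercase": {
                    "field": source,
                    "target_field": target,  # keep the original value untouched
                    "ignore_missing": True,  # skip documents without the field
                }
            }
        ],
    }


def lowercase_runtime_field(source: str, target: str) -> dict:
    """Body for PUT <index>/_mapping that adds a lowercased runtime field."""
    script = (
        f"if (doc.containsKey('{source}') && doc['{source}'].size() != 0) "
        f"{{ emit(doc['{source}'].value.toLowerCase()); }}"
    )
    return {"runtime": {target: {"type": "keyword", "script": {"source": script}}}}


if __name__ == "__main__":
    print(json.dumps(lowercase_pipeline(SOURCE_FIELD, LOWER_FIELD), indent=2))
    print(json.dumps(lowercase_runtime_field(SOURCE_FIELD, LOWER_FIELD), indent=2))
```

With either approach, the exception would then match against the lowercased field, and the entries in the exception or value list would need to be lowercase as well. The ingest pipeline only affects newly indexed documents, while the runtime field is computed at query time and also covers existing data.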
Hello @georgii ,
So far, my environment has a ".caseless" version only for process.name and process.executable, not for the user agent field.
For the case I mentioned, I have a new terms rule that checks for new user agents associated with IP addresses, and I added a shared exception list to exclude known bots from the user agent field. The list has ~200 user agents so far.
I can try runtime fields to normalize the source field values to lowercase, or use ingest pipelines.
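If the field ends up normalized to lowercase, the ~200 entries in the list would need the same treatment. A minimal sketch of that step, assuming the entries are kept as plain text with one user agent per line (the sample values below are just the ones from my example plus a made-up one):

```python
def normalize_entries(entries):
    """Lowercase, trim, and de-duplicate entries, keeping first-seen order."""
    seen = set()
    result = []
    for entry in entries:
        value = entry.strip().lower()
        if value and value not in seen:
            seen.add(value)
            result.append(value)
    return result


if __name__ == "__main__":
    # Sample input only; the real list would be read from a file.
    bots = ["BingBot", "Bingbot", "bingbot", " Googlebot "]
    print("\n".join(normalize_entries(bots)))
```

For the sample input this collapses the three Bing variants into a single "bingbot" entry, so the list also gets shorter.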
Thank you so much for the recommendations and explanations, I appreciate it!