When a signal is created, it contains a series of fields representing what I assume to be information about what triggered the rule. In particular, the signal.parent.* fields (also signal.parents.*, though I don't understand the difference) often contain a document ID under signal.parent.id. However, when the signal is created by a threshold rule, that field holds some other value that is not a document ID (at least not in the default format expected), e.g. 00c876b0-bf18-11eb-a6e0-590e938f2721
Attempts at searching for a document with that ID fail, and I presume it has something to do with how join queries are done. It might be the parent/child relationship feature, but I could not figure out how to use it properly, as none of the mappings specify any relationships, including those of the signals indices.
Does anyone know how to use the information in a threshold signal to retrieve the documents that triggered it?
Thanks for your post. So I'm pulling from a past related question regarding threshold rules where my teammate @madi noted the following about threshold rule alerts:
We did update the functionality in 7.11 so that the fields queried in the original events will NOT be reflected in the signals. This was because the fields are not necessarily the same value across all matches, so it was ambiguous (wildcards can occur in the queries, for example)... that functionality is now provided by the timeline (when you click 'investigate in timeline', the original events are pulled back and you can see everything that matched) [...]
The Timeline functionality for threshold rules is a little unreliable currently, but will be tightened up in the upcoming 7.12 release. You should be able to visualize all the events that made up the signal in Timeline out of the box [...]
Essentially, you should be able to view each individual event relating to that threshold alert when you pull it into your Timeline. Let us know if that helps!
Timeline is nice and all but how is this done outside of the detection's timeline? If I'm trying to integrate other tools to inquire about the signal via RestAPI, like say a SOAR platform or a python script, how do we get that relationship?
From what I can tell so far, threshold signal events do not store any information capable of identifying the exact events that triggered the signal. Timeline creates a query (somehow) from the signal's build information to try to find the relevant events. It seems silly that the document IDs are not stored at all. This effectively makes it very difficult for external tools to work with the Detections module. Though I suppose it won't matter much longer with the removal of direct access to system indices, including any ability to read the data.
Timeline is nice and all but how is this done outside of the detection's timeline? If I'm trying to integrate other tools to inquire about the signal via RestAPI, like say a SOAR platform or a python script, how do we get that relationship?
Today, your scripts/playbooks/runbooks need to manually reconstruct the timeline search criteria to find the relevant source events. The templated timeline investigation is a key step in the investigative workflow, and the search criteria is specified during the "investigate in timeline" interaction only.
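As a rough illustration of what "manually reconstructing the search criteria" could look like in a script, here is a minimal Python sketch. It is not how Timeline builds its query internally. The function name build_source_event_query is hypothetical, and the signal field names it reads (signal.rule.query, signal.rule.language, signal.threshold_result.terms, signal.threshold_result.from, signal.original_time) are assumptions about the signal document layout that may differ between versions, so check them against a real signal in your own cluster. It also only passes the rule query through when the rule language is Lucene, since a KQL query cannot be sent to Elasticsearch as-is.

```python
from typing import Any, Dict, List


def build_source_event_query(signal_doc: Dict[str, Any]) -> Dict[str, Any]:
    """Build an Elasticsearch search body that approximates the source
    events behind a threshold signal.

    All field names read from signal_doc are assumptions (see the text above).
    """
    signal = signal_doc["signal"]
    rule = signal.get("rule", {})
    result = signal.get("threshold_result", {})

    filters: List[Dict[str, Any]] = []

    # Reuse the rule's own query, but only if it is Lucene syntax;
    # KQL would need to be translated to query DSL first.
    if rule.get("query") and rule.get("language") == "lucene":
        filters.append({"query_string": {"query": rule["query"]}})

    # One term filter per threshold field/value pair (assumed layout).
    for term in result.get("terms", []):
        filters.append({"term": {term["field"]: term["value"]}})

    # Time window: assumed to run from threshold_result.from up to the
    # signal's original_time, falling back to the signal's @timestamp.
    window: Dict[str, Any] = {
        "lte": signal.get("original_time", signal_doc.get("@timestamp"))
    }
    if result.get("from"):
        window["gte"] = result["from"]
    filters.append({"range": {"@timestamp": window}})

    return {"query": {"bool": {"filter": filters}}}
```

The returned dictionary could be sent as the body of a _search request against the rule's source index patterns. Because the window and filters are reconstructed after the fact, the result can differ from what the rule actually matched, for example when late-arriving data lands in the same window.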
It sounds like you would like to have either all the source events/IDs (although that could make for a REALLY large alert) or something like a "correlation search" available somewhere in the signal that you could extract and execute as part of your scripts/playbooks/runbooks. Am I understanding that correctly?
We are not planning to remove/deny access to the alerts indices. Instead, we are planning to change them to "hidden" indices that will still be accessible.
Timeline investigations don't always present an accurate representation of what triggered the signal/alert, especially if the source index is modified afterwards, such as by latent or batched data. With automation and expansive, multilayered environments, traceability is very important, given that the Elastic SIEM might not be the central system (or even be used for investigation). While adding the document IDs could create large events, that mostly depends on how the rules are set up. Conversely, it could make events smaller, depending on how much data no longer needs to be appended to the signal/alert once direct references can be made.
Ultimately yes, some way of appending the base IDs that triggered the signal/alert, perhaps not by default but as a configurable setting per rule. I'm not sure what a "correlation search" would be, but yes, even just some way to present the "timeline" query so it could be run directly against Elasticsearch would be better than having to manually work out how to find the events, and then questioning whether it was done correctly (at least from an upstream perspective).