We are measuring the number of occurrences of these events.
All of the events are important.
But the events are not important if a specific event occurs in the middle of a list of events.
For example:
The number of events is important if I have the following sequence: X X X X Y Y X X X X Y X X
The number of events is not important if I have the following sequence: X X X X Y Y X X X X Y X Z X
Here X, Y and Z are my event names.
The second sequence is not important because an event named Z was received in the sequence.
Is it possible to change the ML score in this case, so that we avoid alerting on it?
At what frequency do these events occur? Do they all happen within the same minute? Hour? Day? Or over an unknown, arbitrary time? I think this will be important, because to assess whether or not some kind of event "is in the middle of others", you'll need to wait and see whether the subsequent events appear. How long you need to wait to determine this will matter. At this point, I'm not even entirely convinced this is an ML problem - perhaps it can just be solved with a search and a conditional.
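To make the "search and a conditional" idea concrete, here is a minimal Python sketch. It assumes the event names for one time window have already been fetched into a list; the function name `countable_events` and the `suppressor` parameter are made up for illustration and are not part of any Elastic API.

```python
def countable_events(events, suppressor="Z"):
    """Return the event count for one window, or None if the window should be ignored.

    `events` is a list of event names collected for a single time window
    (how that window is chosen is an open question, see above).
    `suppressor` is the hypothetical event name that makes the count irrelevant.
    """
    if suppressor in events:
        # The suppressing event appeared somewhere in the window: do not score or alert.
        return None
    return len(events)


# The two sequences from the question.
important = "X X X X Y Y X X X X Y X X".split()
ignored = "X X X X Y Y X X X X Y X Z X".split()

print(countable_events(important))  # 13
print(countable_events(ignored))    # None
```

Only windows that return a count would then be passed on to whatever threshold or scoring you use.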
@AmS - I'm still unconvinced this is a use case best solved by ML. It seems to me that you can solve it with the sequence search of EQL (available in v7.9+).
Something like:
GET /events/_eql/search
{
  "query": """
    sequence
      [ myevent where event.value == "X" ]
      [ myevent where event.value == "Y" ]
      [ myevent where event.value == "Z" ]
  """
}
If the events were
X X X X Y Y X X X X Y X X - the above query would return nothing.
If the events were
X X X X Y Y X X X X Y X Z X - the above query would return the sequence.
Then, within an alert (Watch), you could inspect the output of the EQL query and alert accordingly.
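As a rough illustration of that "inspect the output" step, here is a small Python sketch rather than an actual Watch definition. It assumes the EQL search response has already been parsed into a dict and that matched sequences are listed under `hits.sequences`; the helper name `z_sequence_found` and the two sample responses are hand-written for illustration, not real cluster output.

```python
def z_sequence_found(eql_response):
    """Return True if the EQL sequence query matched at least one sequence.

    Assumes the parsed response keeps matched sequences under
    hits -> sequences; a missing or empty list is treated as "no match".
    """
    hits = eql_response.get("hits", {})
    return len(hits.get("sequences", [])) > 0


# Hand-written, heavily trimmed stand-ins for the two cases above.
no_match = {"hits": {"total": {"value": 0, "relation": "eq"}}}
with_match = {
    "hits": {
        "total": {"value": 1, "relation": "eq"},
        "sequences": [{"events": [{"_id": "a"}, {"_id": "b"}, {"_id": "c"}]}],
    }
}

for name, response in [("no Z", no_match), ("Z present", with_match)]:
    # In the original question, a Z in the sequence means "do not alert on the count".
    action = "suppress alert" if z_sequence_found(response) else "alert on count as usual"
    print(name, "->", action)
```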