Hello there,
We are currently experimenting with X-Pack, mainly with Machine Learning.
Can you please show me how we can create a Machine Learning job for our use case?
Our logs contain many fields, but two are important for the Machine Learning job: name (string), e.g. AGENT1, AGENT2, ...; status (string), e.g. FAIL, SUCCESS, ....
We want to monitor status per name, and watch whether there are too many FAILs in status.
Thomas - so this likely calls for one of the count functions, to track the occurrence rate of documents indexed over time. But, two questions:
How many possibilities are there for name (the cardinality)?
How many possibilities are there for status (again, the cardinality)?
Ok great - then an Advanced Job where you select the following in the Detector would accomplish a double-split (as in "count by name and also for every status"):
function: count
by_field_name: name
partition_field_name: status
So, in this way, every unique combination of name and status is modeled independently (for example "AGENT1/FAIL", "AGENT1/SUCCESS", "AGENT2/FAIL", etc.)
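As a rough sketch, the detector portion of that Advanced Job could look like the following, expressed here as a Python dict mirroring the ML job's `analysis_config` JSON. The bucket span and time field are assumptions for illustration; `name` and `status` are the fields from the question.

```python
# Sketch of the analysis_config for the double-split job described above.
# bucket_span and time_field are hypothetical; tune them to your data.
job_config = {
    "analysis_config": {
        "bucket_span": "15m",  # hypothetical; pick to match your ingest rate
        "detectors": [
            {
                "function": "count",
                "by_field_name": "name",           # split: one model per agent
                "partition_field_name": "status",  # further split per status value
            }
        ],
    },
    "data_description": {"time_field": "@timestamp"},  # assumed timestamp field
}

# Each (name, status) pair, e.g. ("AGENT1", "FAIL"), gets its own baseline.
```

With this double split, an unusual count of FAILs for AGENT1 is flagged against AGENT1/FAIL's own history, not against the overall event rate.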
You saved my day @richcollier, thank you.
I have an additional question about my use case.
Is it possible to monitor critical statuses (like FAIL, ...) when the occurrence rate is higher than usual and, at the same time, monitor a low occurrence rate of a wanted status (like SUCCESS)?
Basically, I do not want to monitor a low occurrence of FAIL statuses or a high occurrence of SUCCESS.
Can this be done in one job?
Because the thing you care about is the value of a field (in this case, status), and since you want to "count" the occurrence rate of those values over time (i.e. use one of ML's count functions), you need to create two separate searches over the raw data, and therefore two separate jobs.
job1:
search for only status:FAIL
detector: high_count (alert when FAILs are more frequent than usual)
job2:
search for only status:SUCCESS
detector: low_count (alert when SUCCESSes are less frequent than usual)
If there were a situation in which both conditions could be satisfied by one search, then you could have used two detectors in the same job (via the Advanced Job Wizard), but alas, you need two searches here to accomplish what you want (and thus two jobs).
For each job, just create a Saved Search in Kibana for the condition, and use that Saved Search as the basis for your ML jobs.
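A minimal sketch of the resulting pair of jobs, again as Python dicts standing in for the job JSON. The job names are hypothetical; the queries are the Lucene-style filters you would put in each Kibana Saved Search, and the functions follow the goal stated in the question (too many FAILs, too few SUCCESSes):

```python
# Sketch of the two-job setup described above. Job names are hypothetical;
# each "query" is the filter for that job's Kibana Saved Search.
jobs = {
    "fail-rate": {                        # job1: spike detection on failures
        "query": "status:FAIL",
        "detector": {"function": "high_count", "by_field_name": "name"},
    },
    "success-rate": {                     # job2: drop detection on successes
        "query": "status:SUCCESS",
        "detector": {"function": "low_count", "by_field_name": "name"},
    },
}
```

Keeping the `by_field_name: name` split in both jobs means an agent whose FAILs spike, or whose SUCCESSes dry up, is judged against its own history rather than the fleet's.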