ML jobs

Hello,

I have created the built-in machine learning jobs for Windows and Linux.
For Linux they are working as expected, but for Windows some of the jobs show 0 processed records.
One of those jobs is "windows_anomalous_process_creation".

I ran the following request:
GET /_ml/datafeeds/datafeed-windows_anomalous_process_creation/_preview
and the output is:

[
  {
    "@timestamp" : 1580212136848,
    "host.name" : "DESKTOP-TEST123",
    "process.name" : "GoogleUpdate.exe",
    "process.parent.name" : "svchost.exe",
    "user.name" : "SYSTEM"
  },
  {
    "@timestamp" : 1580213339330,
    "host.name" : "DESKTOP-TEST123",
    "process.name" : "UsoClient.exe",
    "process.parent.name" : "svchost.exe",
    "user.name" : "SYSTEM"
  }
]

Since the datafeed preview looks okay, the problem is probably that your Windows logs are being ingested more slowly than your Linux logs, which undermines the real-time operation of the ML job.

You could test this theory by doing the following:

  1. Go to the ML Anomaly Detection Jobs page.
  2. Clone the windows_anomalous_process_creation job.
  3. Have the cloned job run on some past data.
  4. Let it continue to run in "real-time".
  5. If the number of processed records doesn't increase after entering real-time mode (and assuming you are still ingesting new Windows logs), then you know your ingest delay is bigger than what the ML job is accounting for. In that case you will need to increase the query_delay parameter of the job's datafeed (or figure out why the ingest delay is larger on Windows than on Linux).
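
To make step 5 concrete, here is a minimal Python sketch of how one might size query_delay from observed ingest lag. It assumes you can obtain, for each document, both the event time and the time it arrived in Elasticsearch; the ingest timestamps below are invented sample values, and the helper names and the 30-second margin are hypothetical. The resulting body is what you would send to the update datafeed endpoint (POST _ml/datafeeds/<feed_id>/_update).

```python
import json
import math
from typing import Iterable, Tuple


def max_ingest_lag_seconds(pairs: Iterable[Tuple[int, int]]) -> float:
    """Largest gap, in seconds, between event time and ingest time.

    Each pair is (event_epoch_ms, ingested_epoch_ms).
    """
    return max((ingested - event) / 1000.0 for event, ingested in pairs)


def query_delay_update_body(lag_seconds: float, margin_seconds: int = 30) -> str:
    """JSON body for a datafeed update that sets query_delay above the lag."""
    delay = math.ceil(lag_seconds) + margin_seconds
    return json.dumps({"query_delay": f"{delay}s"})


# Event times come from the preview output; ingest times are made up.
samples = [
    (1580212136848, 1580212256848),
    (1580213339330, 1580213369330),
]
lag = max_ingest_lag_seconds(samples)
print(f"max ingest lag: {lag:.0f}s")
print("update body:", query_delay_update_body(lag))
```

If the measured lag is far larger than the current query_delay, that supports the ingest-delay theory; if it is small, the cause is likely elsewhere.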

Thank you, that solved the issue.

Hi,
I created an anomaly detection job based on a live index. The results were generated up to the time the job was created, but I don't see them being updated as time goes on. Why?

The obvious question to ask here is: "Is your datafeed running?"
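
One way to answer that without the UI is the datafeed stats API (GET _ml/datafeeds/<feed_id>/_stats), which reports a state for each datafeed. Below is a minimal Python sketch that reads the state out of such a response. The sample response is trimmed and the datafeed name is hypothetical.

```python
import json


def datafeed_states(stats_response: dict) -> dict:
    """Map each datafeed_id in a datafeed stats response to its state."""
    return {
        feed["datafeed_id"]: feed["state"]
        for feed in stats_response.get("datafeeds", [])
    }


# Trimmed, hypothetical response body from GET _ml/datafeeds/<feed_id>/_stats
sample = json.loads(
    '{"count": 1, "datafeeds": '
    '[{"datafeed_id": "datafeed-my_job", "state": "stopped"}]}'
)

for feed_id, state in datafeed_states(sample).items():
    # A datafeed only pulls new data while its state is "started".
    status = "running" if state == "started" else "not running"
    print(f"{feed_id}: {state} ({status})")
```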

Only when I chose the advanced job wizard did I get the option to leave the end time empty (real-time job),
and now I do see that the job is updating every 15 minutes.
I added a watch to send an email when there is a warning anomaly,
but although I haven't seen any anomaly warning in the job lately, I got an email:
Job : 5l
Time : 2020-10-04T13:30:00.000Z
Anomaly score : 0

Any idea why?

"bucket_results": {
  "filter": {
    "range": {
      "anomaly_score": {
        "gte": 0
      }
    }
  },
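
A note on the filter above: a range condition with "gte": 0 matches every bucket, including buckets whose anomaly_score is exactly 0, which is consistent with the score-0 email. The Python sketch below mimics that filtering logic on two buckets to show the effect of the threshold. The second bucket and the threshold of 25 are illustrative assumptions, not values from the job.

```python
from typing import Dict, List


def buckets_to_alert(buckets: List[Dict], min_score: float) -> List[Dict]:
    """Keep buckets whose anomaly_score is >= min_score (like a range/gte filter)."""
    return [b for b in buckets if b["anomaly_score"] >= min_score]


buckets = [
    # The bucket reported in the email.
    {"timestamp": "2020-10-04T13:30:00.000Z", "anomaly_score": 0},
    # Hypothetical anomalous bucket, for contrast.
    {"timestamp": "2020-10-04T13:45:00.000Z", "anomaly_score": 31.5},
]

print("gte 0 :", [b["timestamp"] for b in buckets_to_alert(buckets, 0)])
print("gte 25:", [b["timestamp"] for b in buckets_to_alert(buckets, 25)])
```

With a threshold of 0 both buckets trigger the alert; with a higher threshold only the anomalous one does.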

Hi,
any idea why I see
"No influencers found" under Top Influencers,
although I added one?

I have an event rate job and I added customerName.keyword as an influencer,
but when I choose to view by customerName.keyword it says there are no influencers.

Datafeeds running or not running is independent of the job type (Advanced vs. Single Metric, etc.). All of the wizards for setting up the jobs have an option to continue running the job in real-time. You may have missed seeing the buttons.

Influencers are only shown if an influencer is actually found for an anomaly. Configuring an influencer in the job only declares that this influencer is possible. If no influencer exists for an anomaly it just means that the blame for that anomaly was not dominated by the contributions of a particular entity.

Also, in the future, if you have a different question, please open a new topic thread for it instead of adding to an existing discussion thread. It helps with organization and allows people to find answers to relevant topics later.

Thanks.
In the single metric job wizard I only have the button "Start job running in real-time"
(in the advanced wizard I can choose this or an end date).
But I noticed that for the single metric job the graph does not progress in the Single Metric Viewer
(it just stays at the time the results were generated and is not refreshed with new data every 15 minutes or so; in advanced mode, when choosing real time, it is).

Let's get our terminology straight here. In the Single Metric Job "Wizard" there is a button that says "Start job running in real-time".

By clicking this button, you are actually issuing the start command on the job's datafeed. There is nothing on this screen that changes, however.
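
For reference, the start command corresponds to the start datafeed API (POST _ml/datafeeds/<feed_id>/_start). Here is a minimal Python sketch that builds such a request; the function name and the datafeed name are hypothetical. Leaving out "end" is what makes the datafeed keep running in real time, while supplying it makes the datafeed stop once it reaches that time.

```python
import json
from typing import Optional, Tuple


def start_datafeed_request(
    datafeed_id: str,
    start: Optional[str] = None,
    end: Optional[str] = None,
) -> Tuple[str, str, str]:
    """Return (method, path, JSON body) for a start datafeed call."""
    body = {}
    if start is not None:
        body["start"] = start
    if end is not None:
        body["end"] = end  # with an end time the datafeed is not real-time
    return "POST", f"/_ml/datafeeds/{datafeed_id}/_start", json.dumps(body)


# Real-time: no end time supplied.
print(start_datafeed_request("datafeed-my_job"))
# Bounded run over past data only.
print(start_datafeed_request(
    "datafeed-my_job",
    start="2020-10-01T00:00:00Z",
    end="2020-10-04T00:00:00Z",
))
```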

To see if your job is running in real time, one place to look is the jobs page; specifically, look at the spots circled in red: