How to use the Machine Learning settings (filter section) to filter errors in logs across multiple job IDs

machine-learning

(sagar) #1

I want to create a filter that picks out errors from logs across all job IDs in the Machine Learning section. Please help me with that as soon as possible.
I'm attaching a screenshot of what I was trying to do.


(rich collier) #2

The better thing to do here is to create a Saved Search in Kibana's Discover view, using a query filter:

(here the filter is !ERROR - meaning logs that don't contain the word ERROR)

Save this search (using the Save button at the top). Name it something like "Exclude Errors".

Then, create an ML job, but use this newly named Saved Search as the basis for the ML job:

(click on the "Exclude Errors" named saved search on the right)

Now, proceed with the rest of the ML job config! You'll only pass data that isn't errors to the ML job!


(sagar) #3

I tried what you suggested, but I'm getting all the data in the single metric job. I just want the error count from the logs on a particular date, which is what I'm trying to use the filter for. I also want the same filter to be applied to other jobs in the Machine Learning section. Please suggest something as soon as possible.
I'm attaching the screenshot here.



I have now created a single metric job, but it is not picking up the filtered data. What is a possible way to get all the error counts in it using the filter?


(rich collier) #4

By the way, I thought you meant "filter out the errors", but it sounds like you only want to count the errors - my apologies for not understanding you.

So, if you do want to only see the errors, then you've done things correctly by making that Saved Search and creating the ML job in the way that you did.

Why do you think the ML job is seeing more than just the errors? Your screenshots look sensible to me....

Run your ML job configuration as is - then find an anomaly. Compare the actual value shown in the anomaly with the number of errors seen by your saved search at that same time. They should be the same. If they are not, please post screenshots showing that they are different.
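
A minimal sketch of that comparison in Python, assuming the anomaly "actual" values and the saved search counts have already been copied out by hand into two dictionaries keyed by time bucket. The function name and all the numbers are made up for illustration:

```python
from typing import Dict, List, Tuple


def find_count_mismatches(
    ml_actuals: Dict[str, int],
    search_counts: Dict[str, int],
) -> List[Tuple[str, int, int]]:
    """Return (bucket, ml_value, search_value) for every bucket that differs.

    A bucket that is missing on one side is treated as a count of 0.
    """
    mismatches = []
    for bucket in sorted(set(ml_actuals) | set(search_counts)):
        ml_value = ml_actuals.get(bucket, 0)
        search_value = search_counts.get(bucket, 0)
        if ml_value != search_value:
            mismatches.append((bucket, ml_value, search_value))
    return mismatches


# Hypothetical values, not taken from any real job.
ml_actuals = {"2019-01-23": 40, "2019-01-24": 900}
search_counts = {"2019-01-23": 40, "2019-01-24": 12}
print(find_count_mismatches(ml_actuals, search_counts))
```

An empty list would mean the ML job and the saved search agree for every bucket that was checked.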


(sagar) #5

Sir, the actual value in the single metric job is very different from the count in the "include error" saved search. I ran the count for 24th Jan in my metric job, but the ML job is using all the data for that date instead of only the errors that the "include error" saved search returns. I'm attaching a screenshot of it.



You can see the counts are completely different for the ML job and the saved search (include error).

I really don't have any idea why it is showing up like that. Please help with this, and if there is any other way to build a real-time error log metric job, please do suggest it, sir.


(sagar) #6

Sir, can you please help me with this issue?


(rich collier) #7

So that I can understand your setup better, please paste three things from the ML API (https://www.elastic.co/guide/en/x-pack/current/ml-api-quickref.html):

  1. The job details:

GET _xpack/ml/anomaly_detectors/<job_id>

  2. The datafeed details:

GET _xpack/ml/datafeeds/<datafeed_id>

  3. The datafeed preview:

GET _xpack/ml/datafeeds/<datafeed_id>/_preview
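
If it is easier to collect all three in one go, here is a rough Python sketch using only the standard library. It assumes the cluster is reachable without authentication, which is often not the case with X-Pack security enabled; `ml_debug_paths`, `fetch_all`, and the IDs are hypothetical names chosen for illustration.

```python
import json
import urllib.request
from typing import Dict


def ml_debug_paths(job_id: str, datafeed_id: str) -> Dict[str, str]:
    """Build the three GET paths listed above for one job and its datafeed."""
    return {
        "job": f"_xpack/ml/anomaly_detectors/{job_id}",
        "datafeed": f"_xpack/ml/datafeeds/{datafeed_id}",
        "preview": f"_xpack/ml/datafeeds/{datafeed_id}/_preview",
    }


def fetch_all(base_url: str, job_id: str, datafeed_id: str) -> Dict[str, object]:
    """GET each path and parse the JSON reply (assumes no authentication)."""
    results = {}
    for name, path in ml_debug_paths(job_id, datafeed_id).items():
        with urllib.request.urlopen(f"{base_url}/{path}") as response:
            results[name] = json.load(response)
    return results


# Hypothetical IDs; printing the paths needs no running cluster.
for name, path in ml_debug_paths("my-job", "datafeed-my-job").items():
    print(name, "GET", path)
```

Calling `fetch_all("http://localhost:9200", "my-job", "datafeed-my-job")` would return the three parsed responses; the base URL there is an assumption as well.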


(sagar) #8

for

  1. GET _xpack/ml/anomaly_detectors/fzr;


  2. GET _xpack/ml/datafeeds/datafeed-fzr


  3. GET _xpack/ml/datafeeds/datafeed-fzr/_preview

The newtimestamp field is the date field I created in Logstash to capture the actual log error dates.
Even so, the single metric job created from that saved search is still missing the error count on a few days. If there is any alternative, please suggest it.

Here, on 2019-02-18, I have an error count, but it is not shown in the ML job above.

Please reply as soon as possible, sir.


(rich collier) #9

My feeling is that your filtered search isn't really filtering. Judging by the following:

It seems like every interval, you are getting many errors (820, 512, 2, and so on) during those timeframes. You need to make sure you're getting the right values here - as this is what is getting passed to ML.
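
To check those per-interval values against a daily count from Discover, the preview rows can be added up per day. A rough sketch, assuming each preview row carries the time field as epoch milliseconds plus a `doc_count` (that row shape is an assumption about an aggregated datafeed). The timestamps below are invented; only the three counts are the ones mentioned above.

```python
from collections import defaultdict
from datetime import datetime, timezone
from typing import Dict, Iterable, Mapping


def daily_totals(
    preview_rows: Iterable[Mapping[str, int]],
    time_field: str = "newtimestamp",
    count_field: str = "doc_count",
) -> Dict[str, int]:
    """Sum the per-interval counts into one total per UTC calendar day."""
    totals = defaultdict(int)
    for row in preview_rows:
        # Preview timestamps are assumed to be epoch milliseconds.
        moment = datetime.fromtimestamp(row[time_field] / 1000, tz=timezone.utc)
        totals[moment.strftime("%Y-%m-%d")] += row[count_field]
    return dict(totals)


# Invented timestamps: two intervals on 2019-02-18 and one on 2019-02-19 (UTC).
rows = [
    {"newtimestamp": 1550448000000, "doc_count": 820},
    {"newtimestamp": 1550448900000, "doc_count": 512},
    {"newtimestamp": 1550534400000, "doc_count": 2},
]
print(daily_totals(rows))
```

Totals are grouped by UTC day. Kibana shows times in the browser's time zone by default, so day boundaries may not line up exactly with what Discover displays.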

I'm guessing your query section of the ML datafeed is not doing the expected filtering correctly. I don't know for certain, but it looks like there may be an extraneous OR operator:

When I create a filtered search and save it as a saved search, here is what I get for the actual query:

  "datafeed_config": {
    "query": {
      "bool": {
        "must": [
          {
            "query_string": {
              "query": "error",
              "analyze_wildcard": true,
              "default_field": "*"
            }
          }
        ],
        "filter": [],
        "should": [],
        "must_not": []
      }
    },
    "indices": [
      "logstash-*"
    ],
    "types": []
  }

Perhaps change your datafeed's query to look more like mine and see if you get better results.
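
As a sketch of what that change could look like, the Python below builds the same `query_string` query and the body for the datafeed update endpoint (`POST _xpack/ml/datafeeds/<datafeed_id>/_update`). The helper names are hypothetical, and the datafeed ID is the one from the requests earlier in this thread. Stopping the datafeed before updating it is probably the safer route.

```python
import json
from typing import Dict, Tuple


def error_only_query(term: str = "error") -> Dict[str, object]:
    """Bool query with a single query_string clause, mirroring the one above."""
    return {
        "bool": {
            "must": [
                {
                    "query_string": {
                        "query": term,
                        "analyze_wildcard": True,
                        "default_field": "*",
                    }
                }
            ],
            "filter": [],
            "should": [],
            "must_not": [],
        }
    }


def datafeed_update_request(datafeed_id: str, term: str = "error") -> Tuple[str, Dict[str, object]]:
    """Return the update path and the JSON body that replaces the datafeed query."""
    path = f"_xpack/ml/datafeeds/{datafeed_id}/_update"
    return path, {"query": error_only_query(term)}


path, body = datafeed_update_request("datafeed-fzr")
print("POST", path)
print(json.dumps(body, indent=2))
```

The printed body can be pasted into Kibana's Console after the `POST` line. Running the datafeed preview again afterwards should show whether the per-interval counts now match the saved search.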