How to use Machine Learning Settings => (Filter section) to filter out errors in logs, for use with multiple job IDs

sagarleo · February 14, 2019, 9:32am

I want to create a filter that picks out errors from logs across all job IDs in the Machine Learning section. Kindly help me with that ASAP.
I'm attaching a screenshot of what I was trying to do.

richcollier · February 15, 2019, 1:50am

The better thing to do here is to create a Saved Search in Kibana's Discover view, using a query filter:

(here the filter is !ERROR - meaning logs that don't contain the word ERROR)

Save this search (using the Save button at the top). Name it something like "Exclude Errors".

Then, create an ML job, but use this newly named Saved Search as the basis for the ML job:

(click on the "Exclude Errors" named saved search on the right)

Now, proceed with the rest of the ML job config! You'll only pass data that isn't errors to the ML job!

sagarleo · February 15, 2019, 6:46am

I tried what you said, but I'm getting all the data in the single metric job. I just want to get the error count from the logs on a particular date, which is why I am trying to use a filter. I also want the same filter to be applied to other jobs in the Machine Learning section. Please provide a suggestion ASAP.
I'm attaching the screenshot here.

I have now created a single metric job, but I am not getting the filtered data. What is a possible way to get all the error counts in it using the filter?

richcollier · February 15, 2019, 2:07pm

By the way, I thought you meant "filter out the errors", but it sounds like you only want to count the errors - my apologies for not understanding you.

So, if you do want to only see the errors, then you've done things correctly by making that Saved Search and creating the ML job in the way that you did.

Why do you think the ML job is seeing more than just the errors? Your screenshots look sensible to me....

Run your ML job configuration as is, then find an anomaly. Compare the actual value shown in the anomaly with the number of errors seen by your saved search at that same time. They should be the same. If they are not, please post screenshots showing that they are different.
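
If it's easier than eyeballing screenshots, here is a minimal Python sketch of that comparison. It assumes you copy the per-bucket counts by hand from Discover and from the anomaly table into two dicts; the function name and the numbers are made up for illustration and do not come from any real job.

```python
def find_count_mismatches(search_counts, ml_actuals):
    """Return the buckets where the ML 'actual' differs from the saved-search count.

    Both arguments are hypothetical dicts that map a bucket start time
    (any label you like) to a document count copied from Kibana.
    """
    mismatches = {}
    for bucket, actual in ml_actuals.items():
        # A bucket missing from the saved search is treated as zero errors.
        expected = search_counts.get(bucket, 0)
        if actual != expected:
            mismatches[bucket] = (expected, actual)
    return mismatches


# Illustrative numbers only.
search_counts = {"2019-01-24T00:00": 12, "2019-01-24T01:00": 3}
ml_actuals = {"2019-01-24T00:00": 820, "2019-01-24T01:00": 3}

for bucket, (expected, actual) in find_count_mismatches(search_counts, ml_actuals).items():
    print(bucket, "saved search:", expected, "ML actual:", actual)
```

Any bucket it prints is one where the datafeed is probably not applying the same filter as the saved search.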

sagarleo · February 16, 2019, 6:36pm

Sir, the actual value in the single metric job is very different from the count in the "include error" saved search. I did a count for 24th Jan in my metric job, but the ML job is using all the data for that date instead of filtering for errors the way the "include error" saved search does. I'm attaching a screenshot of it.

You can see the counts are completely different for the ML job and the saved search (include error).

I really don't have any idea why it is showing like that. Please help with this, and if there is any other way to build a real-time error log metric job, please do suggest it, sir.

sagarleo · February 19, 2019, 5:50am

Sir, can you please help me with this issue?

richcollier · February 19, 2019, 1:45pm

So I can understand your setup better - Please paste three things from the ML API - https://www.elastic.co/guide/en/x-pack/current/ml-api-quickref.html:

The job details:

GET _xpack/ml/anomaly_detectors/<job_id>

The datafeed details:

GET _xpack/ml/datafeeds/<feed_id>

The datafeed preview:

GET _xpack/ml/datafeeds/<datafeed_id>/_preview
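
If you need to check several jobs, a small Python sketch like the one below can print the three Console requests for each of them. It assumes the datafeed ID is "datafeed-" followed by the job ID, which is how I'd expect a job created through the Kibana wizards to be named; pass the real datafeed ID if yours differs. The job names here are hypothetical.

```python
def ml_debug_requests(job_id, datafeed_id=None):
    """Build the three Console requests (job, datafeed, preview) for one job."""
    if datafeed_id is None:
        # Assumption: the datafeed follows the "datafeed-<job_id>" naming pattern.
        datafeed_id = "datafeed-" + job_id
    return [
        "GET _xpack/ml/anomaly_detectors/" + job_id,
        "GET _xpack/ml/datafeeds/" + datafeed_id,
        "GET _xpack/ml/datafeeds/" + datafeed_id + "/_preview",
    ]


# Hypothetical job IDs, used only to show the output format.
for job in ["errors_job_a", "errors_job_b"]:
    for request in ml_debug_requests(job):
        print(request)
```

Paste each printed line into Kibana's Dev Tools Console and run it.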

sagarleo · February 20, 2019, 7:37am

For GET _xpack/ml/anomaly_detectors/fzr:

(two screenshots of the job details output)

For GET _xpack/ml/datafeeds/datafeed-fzr:

(two screenshots of the datafeed details output)

For GET _xpack/ml/datafeeds/datafeed-fzr/_preview:

(screenshot of the datafeed preview output)

newtimestamp is the date field I created in Logstash to get the actual dates of the log errors.
The single metric job created from that saved search is still missing the error count on a few days. If there is any alternative, please do suggest it.

Here, on 2019-02-18, I have an error count, but it is not shown in the ML job above.

Please reply as soon as possible, sir.

richcollier · February 20, 2019, 2:20pm

My feeling is that your filtered search isn't really filtering. Judging by the following:

It seems like every interval, you are getting many errors (820, 512, 2, and so on) during those timeframes. You need to make sure you're getting the right values here - as this is what is getting passed to ML.

I'm guessing your query section of the ML datafeed is not doing the expected filtering correctly. I don't know for certain, but it looks like there may be an extraneous OR operator:

When I create a filtered search and save it as a saved search, here is what I get for the actual query:

  "datafeed_config": {
    "query": {
      "bool": {
        "must": [
          {
            "query_string": {
              "query": "error",
              "analyze_wildcard": true,
              "default_field": "*"
            }
          }
        ],
        "filter": [],
        "should": [],
        "must_not": []
      }
    },
    "indices": [
      "logstash-*"
    ],
    "types": []
  }

Perhaps change your datafeed's query to look more like mine and see if you get better results.
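
Since you want the same filter on several jobs, here is a rough Python sketch that builds the query above once and prints an update request per datafeed. Treat it as a sketch under assumptions: the datafeed IDs are hypothetical, and the `POST _xpack/ml/datafeeds/<feed_id>/_update` path is my understanding of the 6.x datafeed update endpoint, so check the docs for your version before running the output.

```python
import json


def error_only_query(query_text="error"):
    """Build the bool/query_string query shown above for a given search text."""
    return {
        "bool": {
            "must": [
                {
                    "query_string": {
                        "query": query_text,
                        "analyze_wildcard": True,
                        "default_field": "*",
                    }
                }
            ],
            "filter": [],
            "should": [],
            "must_not": [],
        }
    }


def datafeed_update_requests(datafeed_ids, query_text="error"):
    """Return one Console request string per datafeed, all sharing the same query."""
    body = json.dumps({"query": error_only_query(query_text)}, indent=2)
    return [
        "POST _xpack/ml/datafeeds/{}/_update\n{}".format(feed_id, body)
        for feed_id in datafeed_ids
    ]


# Hypothetical datafeed IDs, for illustration only.
for request in datafeed_update_requests(["datafeed-job-a", "datafeed-job-b"]):
    print(request)
    print()
```

After updating, run the datafeed `_preview` again and confirm the counts match what the saved search shows.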

sagarleo · February 26, 2019, 11:20am

Sorry for the late reply, sir.
I tried your way and now it's working correctly to some extent. I'm really happy that you helped me out, sir. Thank you so much.

system · March 26, 2019, 11:20am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
