Filtering Index or custom rules for Elastic ML anomaly detection

elizobee · June 8, 2021, 4:59pm

I would like to run anomaly detection on a subset of an index. It looks like I am supposed to set up a filter list containing the values I wish to select, and then create a custom rule to associate a field with a filter list and an action. I am supposed to edit the JSON directly in the ML job to do this. I have multiple fields and filter lists to apply to my index.

I am using v7.12.1. I am looking for examples of what the JSON should look like and for information on what works in this version, as the only examples I can find are for later versions. I would also appreciate advice on how to learn to write JSON well enough to apply complex SQL-style "where" conditions to indexes.

Applying a "where" condition to an index for anomaly detection seems like it would be a common task. Are there any plans to add a GUI for this?
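For reference, the filter list and custom rule described in the question could be sketched roughly as follows. This is a minimal sketch, not a confirmed configuration from the thread: the filter ID `my_filter_list`, the item values, and the field `myfieldname` are hypothetical, and the rule is scoped on the detector's "by" field. Note that, as far as I understand, a custom rule changes how results are handled for matching values rather than restricting which documents are read from the index.

```python
import json


def build_filter(items, description=""):
    """Body for PUT _ml/filters/<filter_id> (the filter ID goes in the URL)."""
    return {"description": description, "items": list(items)}


def build_scoped_rule(field_name, filter_id, filter_type="include",
                      actions=("skip_result",)):
    """A custom rule that applies when field_name matches the filter list."""
    return {
        "actions": list(actions),
        "scope": {
            field_name: {"filter_id": filter_id, "filter_type": filter_type}
        },
    }


# Hypothetical values, for illustration only.
filter_body = build_filter(["value_a", "value_b"], "Hypothetical value list")

# The rule sits inside one detector of the job's analysis_config.
detector = {
    "function": "count",
    "by_field_name": "myfieldname",
    "custom_rules": [build_scoped_rule("myfieldname", "my_filter_list")],
}

print(json.dumps(filter_body, indent=2))
print(json.dumps(detector, indent=2))
```

With several fields and filter lists, each field would get its own entry under `scope`, provided the field is a by, over, or partition field of that detector.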

richcollier · June 8, 2021, 5:49pm

Filter the data first using the Discover app in Kibana

Save the search as a named "Saved Search"

Use the Saved Search as the basis of your Anomaly Detection job.

All of this can be done via the UI.
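For readers who want to see what this looks like outside the UI, a job built on a Saved Search ends up, as far as I understand, with the search's filter as the `query` of its datafeed. The sketch below builds such a datafeed body. The job ID, index pattern, and the field `response.keyword` with value `404` are assumptions for illustration, not values confirmed in this thread.

```python
import json


def build_filtered_datafeed(job_id, index_pattern, field, value):
    """Datafeed body whose query acts like a SQL-style "where" condition."""
    return {
        "job_id": job_id,
        "indices": [index_pattern],
        # Only documents matching this query are fed to the job.
        "query": {"bool": {"filter": [{"term": {field: value}}]}},
    }


# Hypothetical names, for illustration only.
datafeed = build_filtered_datafeed(
    "my_job", "my-index-*", "response.keyword", "404"
)
print(json.dumps(datafeed, indent=2))
```

More conditions can be added as further entries in the `filter` list, which behaves like chaining conditions with AND.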

elizobee · June 8, 2021, 6:33pm

Thank you, very helpful! Will be testing shortly.

elizobee · June 8, 2021, 10:05pm

This works. But what if I would also like to aggregate the data in order to speed up the anomaly detection? In your example above, what if I wanted to aggregate the count of records with the 404 keyword (or the sum of some other numeric field) per day, and feed that into the anomaly detection instead of the individual records from the index? I do not have permissions in PRD to create a new index, so it would have to be done with a saved search, by editing JSON, through a GUI, or something similar.

richcollier · June 9, 2021, 12:01am

elizobee · June 11, 2021, 4:22pm

The aggregation does work. I wanted to put a few tips here on how I modified the JSON, using v7.12.1.

First, if you are using the anomaly detection single-metric wizard, the data will be aggregated for you, so you do not need to modify the job manually to aggregate the datafeed.

If you switch to the multi-metric wizard, the data will not be aggregated. However, multi-metric is the wizard that allows you to add influencers; this is not available in the single-metric wizard. So you may need to use multi-metric with only one metric if you want to track influencers.

After setting up your multi-metric job, click "Convert to advanced job" and select Next. You will see a choice for Summary Count Field; select the field that contains your document count. Then select "Edit JSON". You need to edit both the Job Configuration JSON and the DataFeed JSON.

In the Job Configuration JSON you need to add: "by_field_name": "myfieldname" in the detectors section, if you are using a "by" field.
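A rough sketch of what the edited Job Configuration might contain is shown below. It is an illustration under assumptions rather than the exact JSON from this job: `myfieldname` stands in for the "by" field, `@timestamp` for the time field, the detector is a simple count, and the one-day bucket span follows the "per day" requirement mentioned earlier. The summary count field is assumed to be `doc_count`, which is what aggregated datafeeds normally provide.

```python
import json


def build_job_config(by_field, time_field, bucket_span="1d"):
    """Job body for an aggregated datafeed with a "by" field."""
    return {
        "analysis_config": {
            "bucket_span": bucket_span,
            # Tells the job each input row represents doc_count documents.
            "summary_count_field_name": "doc_count",
            "detectors": [
                {"function": "count", "by_field_name": by_field}
            ],
            "influencers": [by_field],
        },
        "data_description": {"time_field": time_field},
    }


# Hypothetical field names, for illustration only.
job_config = build_job_config("myfieldname", "@timestamp")
print(json.dumps(job_config, indent=2))
```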

In the DataFeed JSON you need to add the aggregations section of the JSON following the pattern in the article linked above.
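As a hedged illustration of that pattern, the sketch below builds an `aggregations` section: a date histogram on the time field, a nested `max` aggregation on the same time field (named after it), and a `terms` aggregation named after the "by" field. The field names, the `.keyword` source field, and the `1d` interval are assumptions; the interval is assumed to divide the job's bucket span evenly.

```python
import json


def build_datafeed_aggregations(time_field, by_field, interval,
                                source_field=None, terms_size=100):
    """Aggregations section for the DataFeed JSON.

    The terms aggregation is named after the job's "by" field so the job
    can find it; source_field is the index field it actually reads.
    """
    return {
        "buckets": {
            "date_histogram": {
                "field": time_field,
                "fixed_interval": interval,
            },
            "aggregations": {
                # Max of the time field, named the same as the time field.
                time_field: {"max": {"field": time_field}},
                by_field: {
                    "terms": {
                        "field": source_field or by_field,
                        "size": terms_size,
                    }
                },
            },
        }
    }


# Hypothetical names, for illustration only.
aggs = build_datafeed_aggregations(
    "@timestamp", "myfieldname", "1d", source_field="myfieldname.keyword"
)
print(json.dumps({"aggregations": aggs}, indent=2))
```

For a sum or mean detector, a further metric aggregation named after the detector's field would be nested under the terms aggregation.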

The editor will help align your brackets and you can refresh the resulting datafeed to see the aggregation level change.

Using this method, you can aggregate your datafeed. However, if you want to use text fields that are ineligible for aggregation as "by" fields, the index mapping must first be changed to add a field.keyword sub-field for those fields.
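The mapping change mentioned here might look something like the sketch below, which builds a body for `PUT <index>/_mapping` that adds a `keyword` sub-field to an existing text field. The field name `myfieldname` and the `ignore_above` value are assumptions. Documents indexed before the change would, as far as I know, not have the sub-field populated until they are reindexed or updated.

```python
import json


def keyword_subfield_mapping(text_field, ignore_above=256):
    """Mapping body adding <text_field>.keyword to an existing text field."""
    return {
        "properties": {
            text_field: {
                "type": "text",
                "fields": {
                    "keyword": {
                        "type": "keyword",
                        "ignore_above": ignore_above,
                    }
                },
            }
        }
    }


# Hypothetical field name, for illustration only.
mapping = keyword_subfield_mapping("myfieldname")
print(json.dumps(mapping, indent=2))
```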

system · July 9, 2021, 4:23pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
