How to perform anomaly analysis by having a single metric as a benchmark

anomalyml · November 2, 2017, 6:25am

Hi,

I have an Excel sheet with the following column names:

<Metric_1> <Metric_2> <Server_Name>

I created a mapping in Elasticsearch as follows:

Date: 'Date' datatype
Metric_1: 'Double' datatype
Metric_2: 'Long' datatype
Server_Name: 'Keyword'
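
For reference, the mapping listed above could be expressed as an index-creation body roughly like the one below. This is only a sketch: the helper name `build_mapping` is hypothetical, and the flat `properties` layout assumes a recent Elasticsearch version (releases from around 2017 nest `properties` under a mapping type name).

```python
import json


def build_mapping():
    """Return an index-creation body matching the field types listed above.

    Assumption: typeless mapping layout, as used by recent Elasticsearch
    versions. Older releases expect an extra type-name level.
    """
    return {
        "mappings": {
            "properties": {
                "Date": {"type": "date"},
                "Metric_1": {"type": "double"},
                "Metric_2": {"type": "long"},
                "Server_Name": {"type": "keyword"},
            }
        }
    }


if __name__ == "__main__":
    # Print the body so it can be pasted into an index-creation request.
    print(json.dumps(build_mapping(), indent=2))
```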

After uploading the data into Elasticsearch, I create a multi-metric job. The 'Split Data' and 'Key Fields' settings use the values from the 'Server_Name' column in the Excel sheet, so I can group the values by server name and assess the performance of each server. This lets me compare all the metrics of one server at a given instant to find which metric is behaving anomalously.

Now, I have the following requirement.

I want to use a single metric as a benchmark: detect anomalies in it and then check for corresponding anomalies in the other metrics. This would give me a better model for root cause analysis.

I tried to do this as follows:

I converted the Excel sheet into the following format:

<Metric_Name>

Its corresponding mappings are:

Date: 'Date' datatype
Value: 'Double' datatype
Metric_Name: 'Keyword'

The problem with this approach is that I have to prepare the Excel sheet by copying and pasting all the metric values, one after another, into a single column, with the same date values repeated for each metric.
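
The reshaping described above (one row per date and metric instead of one column per metric) does not have to be done by hand. Below is a minimal sketch in plain Python, assuming the rows have already been read from the sheet into dictionaries; the function name `wide_to_long` and the sample values are hypothetical, while the field names follow the mappings in this post.

```python
def wide_to_long(rows, id_fields=("Date",), metric_fields=("Metric_1", "Metric_2")):
    """Turn one-column-per-metric rows into one-row-per-metric records.

    Each output record keeps the id fields, names the metric in
    'Metric_Name', and stores its value as a float in 'Value'.
    """
    long_rows = []
    for row in rows:
        for metric in metric_fields:
            record = {field: row[field] for field in id_fields}
            record["Metric_Name"] = metric
            record["Value"] = float(row[metric])
            long_rows.append(record)
    return long_rows


if __name__ == "__main__":
    # Hypothetical sample row in the original (wide) layout.
    sample = [
        {"Date": "2017-11-01T00:00:00", "Metric_1": 1.5, "Metric_2": 10, "Server_Name": "server-a"},
    ]
    for record in wide_to_long(sample):
        print(record)
```

Passing `id_fields=("Date", "Server_Name")` would carry the server name into every output record as well.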

After loading this data into Elasticsearch, I could see the individual metrics displayed as top influencers. I do not believe this is the right approach.

Could you suggest how I can perform anomaly analysis using a single metric as the benchmark?

Thanks.

richcollier · November 2, 2017, 7:13pm

Using a single metric's behavior as the "model" against which others are judged is not possible in X-Pack ML. As an alternative, you can compare the behavior of entities against each other using Population Analysis.
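
As a rough illustration of what a population job looks like, the sketch below builds a job configuration in which each server is compared against the other servers for one metric. The helper name `build_population_job`, the `mean` function, and the 15-minute bucket span are assumptions chosen for the example; the field names come from the mapping earlier in the thread. Note that it compares servers with each other on a single metric and does not use one metric as a benchmark for others.

```python
import json


def build_population_job(metric_field="Metric_1",
                         population_field="Server_Name",
                         time_field="Date",
                         bucket_span="15m"):
    """Return an anomaly detection job body for a population analysis.

    'over_field_name' is what makes this a population job: each value of
    the population field is judged against the behavior of all the others.
    """
    return {
        "description": "Population analysis of %s over %s" % (metric_field, population_field),
        "analysis_config": {
            "bucket_span": bucket_span,  # assumed value, tune to the data
            "detectors": [
                {
                    "function": "mean",  # assumed function for the example
                    "field_name": metric_field,
                    "over_field_name": population_field,
                }
            ],
            "influencers": [population_field],
        },
        "data_description": {"time_field": time_field},
    }


if __name__ == "__main__":
    # Print the body so it can be used in a create-job request.
    print(json.dumps(build_population_job(), indent=2))
```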

anomalyml · November 3, 2017, 6:45am

Hi,

That answers my question.

For a Population Analysis, we need a single homogeneous kind of data that can be compared across similar entities, say users or servers, at the same instant to determine which one is behaving anomalously. Therefore, I believe we cannot use the 'Population Analysis' option to compare one metric with several different metrics and find the correlation between them. Please correct me if I'm wrong.

Thanks.

richcollier · November 6, 2017, 2:56pm

Correct - there is no way to define a single metric as a "gold standard" against which others will be compared.

system · December 4, 2017, 2:56pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
