How to perform anomaly analysis by having a single metric as a benchmark

Hi,

I have data in an Excel sheet with the following column names:

<Metric_1> <Metric_2> <Server_Name>

I created a mapping in Elasticsearch as below:

Date: 'Date' datatype
Metric_1: 'Double' datatype
Metric_2: 'Long' datatype
Server_Name: 'Keyword'

After uploading the data into Elasticsearch, I create a multi-metric job. The 'Split Data' and 'Key Fields' options reflect the values in the 'Server_Name' column of the Excel sheet, so I can group the values by server name and assess the performance of each server. Here, I can compare all the metrics of one server at a single instant to find which metric is behaving anomalously.
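For reference, the multi-metric setup described above can be sketched as a job configuration with one detector per metric, each split by server. This is only a minimal sketch: the `mean` function and the `15m` bucket span are assumptions (the post does not state them), and `multi_metric_job_config` is a hypothetical helper that just builds the request body without sending it anywhere.

```python
import json


def multi_metric_job_config(metric_fields, split_field,
                            time_field="Date", bucket_span="15m"):
    """Build a job body with one detector per metric, split by split_field.

    The 'mean' function and the default bucket_span are assumptions.
    """
    detectors = [
        {
            "function": "mean",
            "field_name": metric,
            "partition_field_name": split_field,  # the 'Split Data' field
        }
        for metric in metric_fields
    ]
    return {
        "analysis_config": {
            "bucket_span": bucket_span,
            "detectors": detectors,
            "influencers": [split_field],  # the 'Key Fields' selection
        },
        "data_description": {"time_field": time_field},
    }


multi_metric_config = multi_metric_job_config(
    ["Metric_1", "Metric_2"], "Server_Name"
)
print(json.dumps(multi_metric_config, indent=2))
```

Each detector is modeled independently per server, which is why the results show which metric is anomalous for a given server rather than how one metric relates to another.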

Now, I have the below requirement.

I want to use a single metric as the benchmark: detect anomalies in that metric first, and then check the other metrics for corresponding anomalies. This would give me a better model for root cause analysis.

I tried to do this as below:

I reshaped the Excel sheet into the format below:

<Metric_Name>

Its corresponding mappings are:

Date: 'Date' datatype
Value: 'Double' datatype
Metric_Name: 'Keyword'

The problem with this approach is that I have to prepare the Excel sheet by hand, copying and pasting all the metric values one after another into a single column, with the same date values repeated for each metric.
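The manual copy-and-paste step can be scripted. Below is a minimal sketch using only the Python standard library, assuming the sheet has been exported to CSV with a Date column alongside the metric columns. The sample values are made up, `wide_to_long` is a hypothetical helper, and carrying Server_Name into the long format is an assumption (the reshaped mapping above does not list it).

```python
import csv
import io

# Hypothetical sample in the original wide layout (values are made up).
WIDE_CSV = """Date,Metric_1,Metric_2,Server_Name
2018-01-01T00:00:00,0.5,120,server-a
2018-01-01T00:00:00,0.7,95,server-b
"""


def wide_to_long(wide_file, metric_columns, id_columns=("Date", "Server_Name")):
    """Yield one long-format row per (input row, metric) pair."""
    for row in csv.DictReader(wide_file):
        for metric in metric_columns:
            # Repeat the identifying columns for every metric of this row.
            long_row = {col: row[col] for col in id_columns}
            long_row["Metric_Name"] = metric
            long_row["Value"] = float(row[metric])
            yield long_row


long_rows = list(wide_to_long(io.StringIO(WIDE_CSV), ["Metric_1", "Metric_2"]))
for long_row in long_rows:
    print(long_row)
```

With a real export, `io.StringIO(WIDE_CSV)` would be replaced by an opened file, and the resulting rows could be written back out with `csv.DictWriter` before indexing.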

After loading this data into Elasticsearch, I could see the individual metrics listed as top influencers. However, I don't believe this is the right approach.

Could you suggest a suitable way to perform anomaly analysis with a single metric as the benchmark?

Thanks.

Using a single metric's behavior as the "model" against which others are judged is not possible in X-Pack ML. As an alternative, you can compare the behavior of entities against each other using Population Analysis.
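As a rough illustration of the Population Analysis alternative, a population job puts the entity field in the detector's `over_field_name`, so each server is compared against the rest of the servers for one metric. This is a sketch under assumptions: the `mean` function and `15m` bucket span are not from this thread, and `population_job_config` is a hypothetical helper that only builds the job body.

```python
import json


def population_job_config(metric_field, entity_field,
                          time_field="Date", bucket_span="15m"):
    """Build a population job body: entities are compared with each other.

    The 'mean' function and the default bucket_span are assumptions.
    """
    return {
        "analysis_config": {
            "bucket_span": bucket_span,
            "detectors": [
                {
                    "function": "mean",
                    "field_name": metric_field,
                    # over_field_name defines the population of entities.
                    "over_field_name": entity_field,
                }
            ],
            "influencers": [entity_field],
        },
        "data_description": {"time_field": time_field},
    }


population_config = population_job_config("Metric_1", "Server_Name")
print(json.dumps(population_config, indent=2))
```

Note that this compares servers on the same metric; it does not make one metric a reference for the others.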

Hi,

This answers my question.

For a Population Analysis, we need a single homogeneous kind of data that can be compared across similar entities (say, various users or servers) at the same time instant to determine which one is behaving anomalously. Therefore, I believe we cannot use the 'Population Analysis' option to compare one metric with several different metrics and find the correlation between them. Please correct me if I'm wrong.

Thanks.

Correct - there is no way to define a single metric as a "gold standard" against which others will be compared.
