Kibana ML anomaly detection and regression

Hi,

I have trend data, and I want to identify customers whose turnover is climbing or falling.
Which Kibana ML analysis can I use to identify them: anomaly detection or regression?
I would also like to know how to interpret the results of anomaly detection based on a population metric, and how to interpret the results of a regression.
For regression, for example, I ran an analysis that predicts the turnover, but I don't know what to do with the result.
How can I interpret it? (I have a training R² of 0.7 and a testing R² of 0.4.)
When can I say my model is good?
How does regression analysis in Kibana work? I used a training percent of 90.
This normally means that 90% of the data is used for training and 10% for testing.
Is that 10% part of my data, or is it unseen data that Kibana generates?

Thanks for your answers.

Cordially
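On the training percent question: in Elastic data frame analytics, `training_percent` is the share of eligible documents used for training, so the held-out 10% comes from the same source data and is not generated. A minimal sketch of the idea follows. It is not Kibana's implementation: the data is made up, and a plain least-squares line stands in for Kibana's regression model, purely to show how a training R² and a testing R² are obtained from one dataset.

```python
import random


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


random.seed(0)

# Hypothetical data: 200 (feature, turnover) rows with noise.
rows = [(x, 3.0 * x + random.gauss(0, 20)) for x in range(200)]

# A 90/10 split of the SAME data: nothing new is generated.
random.shuffle(rows)
cut = int(len(rows) * 0.9)
train, test = rows[:cut], rows[cut:]

slope, intercept = fit_line([x for x, _ in train], [y for _, y in train])


def predict(x):
    return slope * x + intercept


# R² on rows the model was fitted on versus rows it never saw.
train_r2 = r_squared([y for _, y in train], [predict(x) for x, _ in train])
test_r2 = r_squared([y for _, y in test], [predict(x) for x, _ in test])

print(f"training R2 = {train_r2:.2f}, testing R2 = {test_r2:.2f}")
```

A testing R² well below the training R² (such as 0.4 against 0.7) usually suggests the model fits the training rows better than it generalizes, so the testing figure is the more honest estimate of quality.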

If you have a time-series based trend-line of the rates of customer turnovers, you can certainly use anomaly detection to assess if the current rate is higher/lower than typical.
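As a rough illustration, the body of an anomaly detection job for this kind of data might look like the sketch below, built as a Python dictionary that could be sent to `PUT _ml/anomaly_detectors/<job_id>`. The field names `turnover`, `customer_id`, and `timestamp`, as well as the one-day bucket span, are assumptions and not taken from the actual index. Setting `over_field_name` makes this a population job, as mentioned in the question; using `partition_field_name` instead would model each customer against its own history.

```python
import json

# Hypothetical field names: "turnover", "customer_id", and "timestamp"
# stand in for whatever the real index contains.
job_body = {
    "description": "Unusual turnover per customer (illustrative sketch)",
    "analysis_config": {
        "bucket_span": "1d",  # assumed granularity
        "detectors": [
            {
                # "mean" flags both unusually high and unusually low values.
                "function": "mean",
                "field_name": "turnover",
                # Population analysis: compare each customer with the others.
                "over_field_name": "customer_id",
            }
        ],
        "influencers": ["customer_id"],
    },
    "data_description": {"time_field": "timestamp"},
}

print(json.dumps(job_body, indent=2))
```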

If you are trying to assess or predict whether a specific customer is likely to turn over (based on the values of other fields that could be indicative), then a classification analytics job would be the right approach. See a good example of that here: https://www.elastic.co/webinars/introduction-to-supervised-machine-learning-in-elastic

Thanks, Richcollier. Could you please explain how the typical value is calculated? I have some values that I don't understand.

The actual value is from the raw data itself (the data ML is analyzing). The typical value is the highest probable value from the internal statistical model that ML has constructed for that data set.

Perhaps a basic introductory video on ML's anomaly detection would be helpful.

Okay, thank you. But what exactly are the statistical models that are used to calculate typical values?

We designed them. You can get a sense of the foundation by looking at this whitepaper:

http://www.ijmlc.org/papers/398-LC018.pdf

Or probably easier is to watch this:

https://www.elastic.co/elasticon/conf/2017/sf/machine-learning-and-statistical-methods-for-time-series-analysis

Hi Richcollier, I have another question. My typical values look like this (see screenshot).
I think this is not normal. What do you think?

I'm not sure I understand why you think those values look incorrect. Can you provide more context and/or the reason why you think that?

I think that because typical values are supposed to be the "highest probable values" (p-values), so they should be between 0 and 1. Is that correct?
But in my screenshot I have values greater than 1.
On the other hand, the p-value is not smaller than 0.05. Does that mean that the result is not significant?

Cordially,

The typical value is the most probable value of the measurement, not the highest probability. As an analogy, the most probable total when two dice are rolled is 7.
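The dice analogy can be checked with a few lines of Python. This is only an illustration of "most probable value" versus "probability"; it says nothing about how the ML model itself is built.

```python
from collections import Counter
from itertools import product

# Count how often each total occurs over all 36 equally likely rolls of two dice.
totals = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# The "typical" value in this analogy is the most probable total (the mode).
most_probable_total, ways = totals.most_common(1)[0]

# Its probability is a separate number, and that one does lie between 0 and 1.
probability = ways / 36

print(most_probable_total)    # 7
print(round(probability, 3))  # 0.167
```

The typical value (7) is on the same scale as the measurement, so it can be greater than 1, while only its probability (about 0.167) is bounded by 0 and 1.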

Okay, I understand better now. But I still don't understand my result.
This is the result I got after the analysis:

As you can see, there is a big gap between the actual and the typical values.

Ignore my last posts, @richcollier. I understand now.
Thank you very much.
