Understanding Joint Probability of Less Likely Samples :: calculate method

I bought Collier's great book, I've read the Prelert paper about APM monitoring, and then I've been through the ml-cpp code on GitHub...

A tiny spark of curiosity lit up inside me about this method:

ProbabilityAggregators::CJointProbabilityOfLessLikelySamples::calculate

Is it correct to say that this method answers, e.g., the question: "If I have 3 samples with values 0.4, 0.2 and 0.3, what's the probability of seeing a set of n samples with a lower joint probability than that of these 3 samples?"
Am I understanding this correctly?

I'm confused because the GitHub code states the following in its description:

We want to find the probability of seeing a more extreme collection
of independent samples {y(1), y(2), ... , y(n)} than a specified (1)
collection {x(1), x(2), ... , x(n)}, where this is defined as the
probability of the set R:
{ {y(i)} : Product_j{ L(y(j)) } <= Product_j{ L(y(j)) } }. (2)

We will assume that y(i) ~ N(m(i), v(i)). In this case, it is possible
to show that:
P(R) = gi(n/2, Sum_i{ z(i)^2 } / 2) / g(n/2) (3)

where,
z(i) = (x(i) - m(i)) / v(i) ^ (1/2).
gi(., .) is the upper incomplete gamma function.
g(.) is the gamma function.

(1) Looking at the unit test, it seems this probability is then multiplied by the sample size (n == 20000) in order to get the expected count of samples with a lower probability than the set of 3 in the case tested, but according to the description it should be n == 3, as in {y(1), y(2), y(3)}... what am I getting wrong here?

(2) The left- and right-hand sides of (2) look identical to me; is that OK?

(3) Is it possible to get some info about the theorem that proves this?

I would be very happy if you could lend me a hand with this stats knot that's tightening in my brain...

This is a typo in the GitHub comment: it should read { y(i) : \prod_{i}{ L(y(i)) } <= \prod_{i}{ L(x(i)) } }, i.e. the set of all points for which the joint likelihood is less than the joint likelihood of the values {x(i)} just observed.

If the likelihoods are normal, then this is just the integral of the multivariate normal distribution outside the ellipsoid, i.e. over the points for which (y - m)^t \Sigma^{-1} (y - m) >= (x - m)^t \Sigma^{-1} (x - m). (In fact, even if they are non-normal, but the tails aren't too heavy, this often gives something remarkably close to the true value as per the definition.)

The fact that this is expressible as an incomplete gamma function is a standard result; for example, see here and particularly this reference. The equations in that reference aren't numbered so I can't link the exact one, but the relevant one is towards the end, i.e.

F(r) = \int_0^{r^2} x^{(n-2)/2} e^{-x/2} / (2^{n/2} \Gamma(n/2)) \, dx

and compare it with the definition of the incomplete gamma function. To get a sense of what this is doing, have a look at this test, which compares the closed form to the value calculated by sampling.
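To make that comparison concrete, here is a minimal, self-contained sketch of both sides of it: the closed form (3) via the normalised upper incomplete gamma function, and a Monte Carlo estimate in the spirit of the test linked above. Everything here is illustrative rather than the actual ml-cpp code: the function name is made up, the z values borrow the numbers from the original question as hypothetical standardised deviations, and boost::math::gamma_q is used because it computes exactly gi(n/2, S/2) / g(n/2). Note that the 20000 plays the role of the number of sampled collections, while the n in equation (3) is the size of each collection (here 3), which is the distinction behind question (1).

```cpp
// Sketch only: NOT the ml-cpp implementation. Evaluates equation (3) and
// checks it against sampling, in the spirit of the referenced unit test.
#include <boost/math/special_functions/gamma.hpp>

#include <cstddef>
#include <iostream>
#include <random>
#include <vector>

// Closed form: P(R) = gi(n/2, S/2) / g(n/2) with S = Sum_i{ z(i)^2 }, which
// is the normalised upper incomplete gamma function Q(n/2, S/2).
double jointProbabilityOfLessLikelySamples(const std::vector<double>& z) {
    double s = 0.0;
    for (double zi : z) {
        s += zi * zi;
    }
    return boost::math::gamma_q(static_cast<double>(z.size()) / 2.0, s / 2.0);
}

int main() {
    // Hypothetical standardised deviations z(i) = (x(i) - m(i)) / v(i)^(1/2),
    // borrowing the numbers from the question for illustration.
    const std::vector<double> z{0.4, 0.2, 0.3};

    std::cout << "closed form P(R) = "
              << jointProbabilityOfLessLikelySamples(z) << '\n';

    // Monte Carlo check: draw 20000 collections y ~ N(0, I) of size 3 and
    // count how often their joint likelihood is lower than that of {z(i)}.
    // For standard normals that is equivalent to Sum_i{ y(i)^2 } > S.
    double s = 0.0;
    for (double zi : z) {
        s += zi * zi;
    }

    std::mt19937 rng(42);
    std::normal_distribution<double> normal(0.0, 1.0);
    const int collections = 20000;
    int moreExtreme = 0;
    for (int t = 0; t < collections; ++t) {
        double sy = 0.0;
        for (std::size_t i = 0; i < z.size(); ++i) {
            double y = normal(rng);
            sy += y * y;
        }
        if (sy > s) {
            ++moreExtreme;
        }
    }
    std::cout << "sampled     P(R) ~ "
              << static_cast<double>(moreExtreme) / collections << '\n';
}
```

The two printed numbers should agree to within Monte Carlo error (roughly 1/sqrt(20000) here), which is the kind of agreement the referenced unit test checks for.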


Tom & Rich, many thanks for the fast, kind & comprehensive reply! The stats knot in my brain has come completely undone... :stuck_out_tongue:
