According to Machine Learning Lab2, Section 15, the key field
can affect the final anomaly score in the same way as the partition field.
I'm confused about this: if the key field works like the partition field, then why does it exist?
Can anybody explain how influencers work behind the scenes?
The question and answer from the lab are below:
Create a Multi Metric job that only uses the Count but also sets a Keyfield of airline.keyword. Compare the results of this to the previous Single Metric job you created for the farequote data. Both show a single critical anomaly for the same airline on the same date but they have a different Anomaly Score. Why is this?
Show answer:
The Single Metric job created a score of 80 and the Multi Metric job using the Keyfield generated one of 98. This is because with the Multi Metric job you compared the airlines to their own past behavior rather than the entire dataset as a whole. When creating Machine Learning jobs you will have to consider if it makes sense to partition or not. And sometimes you'll want a job that does both!
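
To check my own understanding of the "own past behavior vs. the whole dataset" part, here is a rough Python sketch. Everything in it is an assumption for illustration: the airline names and counts are made up, and a plain z-score stands in for the anomaly score (it is not how Elastic ML actually computes its score). It only shows why the same spike can look more unusual when one airline is modelled on its own than when all airlines are summed together.

```python
from statistics import mean, pstdev


def zscore(series, index):
    """How far series[index] sits from the other points, in standard deviations."""
    baseline = series[:index] + series[index + 1:]
    return (series[index] - mean(baseline)) / pstdev(baseline)


# Hypothetical event counts per time bucket for three made-up airlines.
counts = {
    "A": [100, 104, 98, 102, 96, 100, 103, 97],
    "B": [50, 60, 45, 55, 40, 58, 47, 52],
    "C": [10, 11, 9, 10, 10, 30, 11, 9],  # spike in bucket 5
}
spike = 5

# "Single metric" view: one series, the count over all airlines together.
total = [sum(bucket) for bucket in zip(*counts.values())]
total_z = zscore(total, spike)

# "Split" view: airline C is compared only with its own history.
split_z = zscore(counts["C"], spike)

print(f"whole dataset: {total_z:.1f}  airline C alone: {split_z:.1f}")
```

In the summed series the spike is partly hidden by the normal variation of the busier airlines, so it scores lower. In the split series the same spike is measured against a quiet baseline, so it scores higher.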