Hello!
I am new to machine learning in the Elastic products. It was released here recently, and we are on version 5.4.
I want to create a machine learning job, but I am not sure how to do it.
Sample data (obfuscated):
Let's say we have data coming in (Filebeat --> Logstash --> Elasticsearch). The first field is person_id, which identifies a unique person. Every person eats either an apple or an orange (field "food"), where apple means good and orange means bad. I need to see whether there are irregularities between the two: either look at both values, or just count "orange", though eating too many apples may also be important to know about (or bad).
I want to create a machine learning job that finds anomalies among the 10 persons who most frequently eat apples or oranges.
(Top 10 here means, for example: Jimmy ate an apple 10 times (he has 10 data points), Alex has 9 data points, and the others have 3 to 5 points each, so Jimmy and Alex are now the top 2.)
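To make the "top N" idea concrete, here is a small Python sketch of the ranking I mean. It only counts data points per person_id on made-up sample events (the names Sam and Kim and all the counts besides Jimmy's and Alex's are invented for illustration); it is not how machine learning itself would do it.

```python
from collections import Counter

# Made-up sample events: Jimmy has 10 data points, Alex has 9,
# and the hypothetical persons Sam and Kim have 5 and 3.
events = (
    [{"person_id": "Jimmy", "food": "apple"}] * 10
    + [{"person_id": "Alex", "food": "orange"}] * 9
    + [{"person_id": "Sam", "food": "orange"}] * 5
    + [{"person_id": "Kim", "food": "apple"}] * 3
)


def top_persons(events, n=10):
    """Return the n person_ids with the most data points, most frequent first."""
    counts = Counter(event["person_id"] for event in events)
    return counts.most_common(n)


top2 = top_persons(events, n=2)
print(top2)  # [('Jimmy', 10), ('Alex', 9)]
```

With n=10 this would give the ten persons I would like the job to focus on.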
The issue: under "Fields" I can only see the count of events and "offset", not any specific fields. I can select them under "Key fields", but that doesn't seem to do anything (or can I do that after creating the job?).
I can choose to split the data. This separates the persons individually, which is good, since I want to track their activity individually and not as a group. But there are more than ten of them (probably many more, since I can only see ten in the preview).
Do I create 10 different jobs, each with a specific "person_id", or can it be done in a single job? Can I separate the persons and have machine learning look at each of them having either "food = apple" or "food = orange"?
An ideal way would be to have an event count for the 10 persons and set the "food" field as the influencer. One main issue is: how do I split the data while keeping only the top 10 persons?
Right now I am going through the advanced editor, since the basic one doesn't seem to be applicable to this scenario, though I may be wrong.
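To show what I have in mind, here is a rough Python sketch that only builds the JSON bodies I imagine would be needed; it does not talk to a cluster. It is a guess, not something I have working. The assumptions are mine: that person_id is an aggregatable (keyword) field, that the time field is "@timestamp", that a 15m bucket span is reasonable, and that a "count" detector using "food" as the by field and "person_id" as the partition field is the right shape. The helper function names are made up for this example.

```python
import json


def top_persons_search(n=10):
    """Search body with a terms aggregation for the n most frequent person_id values."""
    return {
        "size": 0,  # only the aggregation is needed, not the hits
        "aggs": {"top_persons": {"terms": {"field": "person_id", "size": n}}},
    }


def datafeed_query(person_ids):
    """Query that would restrict the job's input to the chosen persons."""
    return {"terms": {"person_id": list(person_ids)}}


def job_config(bucket_span="15m"):
    """Job body: event count per food value, modelled separately for each person."""
    return {
        "description": "Event count per person_id, split by food (sketch)",
        "analysis_config": {
            "bucket_span": bucket_span,  # assumed value
            "detectors": [
                {
                    "function": "count",
                    "by_field_name": "food",
                    "partition_field_name": "person_id",
                }
            ],
            "influencers": ["person_id", "food"],
        },
        "data_description": {"time_field": "@timestamp"},  # assumed time field
    }


print(json.dumps(top_persons_search(10), indent=2))
print(json.dumps(datafeed_query(["Jimmy", "Alex"]), indent=2))
print(json.dumps(job_config(), indent=2))
```

The idea would be to run the first body as a search, take the returned person_id keys, and use them in the second body as the query that feeds the job. The obvious weakness is that the top 10 is then fixed at the moment I run the search, so it would not follow changes in who the most frequent persons are.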
Somewhat related: Related topic I found
Thank you!