We're evaluating machine learning to see whether it will do the job we need, but we are running into issues when creating population jobs. For the population field we can only select keyword fields. Since our user IDs are integers, we can't split the population this way. If we instead store the user ID as a keyword, we can select it and we get good results.
Is this restriction there because of cardinality concerns, i.e. the same as this issue?
The Population UI doesn't allow it because a numerical field chosen as the population would often be a misconfiguration (imagine selecting response_time as a population field, for example; it would not be sensible). In other words, it is trying to prevent you from making a mistake.
Now, if you really don't want to modify how that field is stored, you can still create an ML job in the UI. You just need to do it in the Advanced Job wizard (or via the API). To make a job a population job, set the field that defines the population as the over_field_name of the detector. The Advanced Job wizard does not prevent you from selecting numerical fields.
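As a rough illustration of the API route, here is a minimal sketch that only builds the job body and prints it as JSON; it does not send anything to a cluster. The field names `user_id` and `@timestamp`, the `count` function, and the 15 minute bucket span are assumptions chosen for the example, and the helper `build_population_job` is hypothetical, so adjust them to your own data. The resulting body is what you would submit to the anomaly detection job creation endpoint (the exact path depends on your stack version).

```python
import json


def build_population_job(over_field, time_field="@timestamp", bucket_span="15m"):
    """Build an anomaly detection job body that uses `over_field` as the population.

    Hypothetical helper for illustration; it only assembles a dict.
    """
    return {
        "description": f"Population job over {over_field}",
        "analysis_config": {
            "bucket_span": bucket_span,
            "detectors": [
                {
                    # Assumed function; pick whichever fits your use case.
                    "function": "count",
                    # Setting over_field_name is what makes this a population job.
                    "over_field_name": over_field,
                }
            ],
            # Listing the population field as an influencer is optional.
            "influencers": [over_field],
        },
        "data_description": {"time_field": time_field},
    }


# Assumed integer field name from the question.
job = build_population_job("user_id")
print(json.dumps(job, indent=2))
```

The same detector settings can be entered by hand in the Advanced Job wizard, which is where the numerical field becomes selectable.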