Cannot split by numeric fields in multi metric machine learning jobs

Hello. I have web server logs that contain a numeric field, response_code, but I cannot use this field under "Split Data" in machine learning multi metric jobs. In fact, no numeric fields are shown there. Is this by design?


When creating a multi metric job, numeric fields cannot be used as the split field. This is because numeric types can have very high cardinality, so the split could result in literally billions of individual metrics.
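To put rough numbers on the cardinality concern, here is a minimal sketch that counts how many distinct values each Elasticsearch integer type can hold in principle. The bit widths are those of the byte, short, integer and long field types; the helper name `possible_values` is only for illustration. A real field usually holds far fewer distinct values, but the job wizard cannot know that from the type alone.

```python
# Bit widths of the Elasticsearch integer field types.
INTEGER_TYPE_BITS = {"byte": 8, "short": 16, "integer": 32, "long": 64}


def possible_values(bits: int) -> int:
    """Upper bound on distinct values for a signed integer of this width."""
    return 2 ** bits


for name, bits in INTEGER_TYPE_BITS.items():
    # Each distinct value would become its own split (its own metric series).
    print(f"{name:8s} up to {possible_values(bits):,} distinct values")
```

Even a 32-bit integer field allows for more than four billion potential split values, which is the scale the reply above refers to.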

For multi metric jobs you can split by keyword fields or IP addresses. Do you have the response code or an HTTP status as a keyword field? If so, you can split by that.


The response_code field has data type short. It contains only HTTP status codes, so there are not many distinct values. If a multi metric job cannot split by a numeric field, what is the best practice for fields like the HTTP response code? Is it best to store it as a keyword, or as both a keyword and a numeric data type?


For this field it would be best to store as a keyword.

The main reason to store it as a numeric type would be if you wanted to perform a numeric aggregation on the field. In this case, I doubt sum(response_code) or average(response_code) would be meaningful.
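As a rough illustration of that advice, the sketch below builds an index mapping body in which response_code is a keyword. The optional numeric sub-field is an assumption added only to show how both representations could coexist via multi-fields; its name, `as_number`, is hypothetical, and the advice above suggests the keyword alone is enough for this field. Note that the type of an existing field generally cannot be changed in place, so applying this would mean creating a new index and reindexing.

```python
import json

# Sketch of an index mapping body, not taken from the original poster's setup.
mapping = {
    "mappings": {
        "properties": {
            "response_code": {
                # Keyword type so the field can be used as a split field.
                "type": "keyword",
                "fields": {
                    # Hypothetical optional sub-field, only needed if numeric
                    # aggregations or range queries are wanted later.
                    "as_number": {"type": "short"},
                },
            }
        }
    }
}

# Print the JSON body that would be sent when creating the index.
print(json.dumps(mapping, indent=2))
```

With a mapping like this, the split would use response_code, while any numeric query would target response_code.as_number.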
