I am trying to set up Machine Learning in Kibana using the X-Pack plugin.
I uploaded some test data in the following format:
"format": "YYYY-MM-dd HH:mm:ss.SSS z"
request.start_timestamp:July 1st 2017, 20:11:27 property.site_id:86 property.country_code:GB property.locale:en_GB device.sess_cookie:96f19e37-7ecf-4f3e-a633-ca96e8559be2 _id:20ac1849-b1d2-4caf-98c8-4d43c79dec84 _type:anomaly _index:anomaly3_detection _score:1
When I go to Machine Learning, try to create a simple single metric model, and press the 'use full index* data' button, I get the following:
The 'Run' button is also disabled.
Oddly enough, I am able to create an advanced job based on the same data, but apart from the model being very inaccurate, I get the following error at the top of my chart:
The log says the following:
Caused by: java.lang.IllegalStateException: value source config is invalid; must have either a field context or a script or marked as unwrapped
at org.elasticsearch.search.aggregations.support.ValuesSourceConfig.toValuesSource(ValuesSourceConfig.java:227) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.search.aggregations.support.ValuesSourceAggregatorFactory.createInternal(ValuesSourceAggregatorFactory.java:51) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.search.aggregations.AggregatorFactory.create(AggregatorFactory.java:221) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.search.aggregations.AggregatorFactories.createTopLevelAggregators(AggregatorFactories.java:224) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.search.aggregations.AggregationPhase.preProcess(AggregationPhase.java:55) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:106) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.indices.IndicesService.lambda$loadIntoContext$16(IndicesService.java:1130) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.indices.IndicesService.lambda$cacheShardLevelResult$18(IndicesService.java:1211) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.indices.IndicesRequestCache$Loader.load(IndicesRequestCache.java:160) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.indices.IndicesRequestCache$Loader.load(IndicesRequestCache.java:143) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.common.cache.Cache.computeIfAbsent(Cache.java:401) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.indices.IndicesRequestCache.getOrCompute(IndicesRequestCache.java:116) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.indices.IndicesService.cacheShardLevelResult(IndicesService.java:1217) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.indices.IndicesService.loadIntoContext(IndicesService.java:1129) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:246) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:263) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.action.search.SearchTransportService$6.messageReceived(SearchTransportService.java:330) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.action.search.SearchTransportService$6.messageReceived(SearchTransportService.java:327) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler$1.doRun(SecurityServerTransportInterceptor.java:258) ~[?:?]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-5.5.0.jar:5.5.0]
... 25 more
Any idea why this is the case?
I have created an index following the mappings provided above, and I do not see the same errors.
One reason for this might be that the Single Metric job setup uses Kibana index patterns, whereas the Adv Config does not. Could you please try to refresh the index pattern and see if this resolves the problem?
Also, can you plot this data in a standard Kibana visualisation? For example, try a line chart plotting count with a date histogram x-axis of request.start_timestamp. This is close to the ML search, so you might get a more informative error this way.
The second screenshot showing "model bounds are not available" is an informational message explaining that the configured job has not captured the model plot data. This is switched on by default in the Single Metric job setup, but must be manually enabled in the Adv Config. See https://www.elastic.co/guide/en/elasticsearch/reference/5.5/ml-job-resource.html#ml-apimodelplotconfig
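As a rough illustration of what enabling model plot in an advanced job could look like, here is a minimal Python sketch that only builds the job configuration as a dictionary. The helper name build_job_config, the "5m" bucket span, and the single count detector are assumptions chosen for the example; only the model_plot_config setting comes from the documentation linked above.

```python
import json


def build_job_config(time_field, bucket_span="5m"):
    """Build an ML job body with model plot enabled.

    Hypothetical helper: the bucket span and the count detector are
    placeholder assumptions, not values taken from this thread.
    """
    return {
        "analysis_config": {
            "bucket_span": bucket_span,
            "detectors": [{"function": "count"}],
        },
        "data_description": {"time_field": time_field},
        # This is the setting that the Single Metric wizard switches on by default.
        "model_plot_config": {"enabled": True},
    }


job = build_job_config("request.start_timestamp")
print(json.dumps(job, indent=2))
```

The resulting JSON could then be adapted for the JSON editor in the advanced job wizard or for the job creation API.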
We would need to understand a bit more about the data and the job config in order to comment on the data accuracy. If you continue to experience problems in this respect, please could you raise this in the X-Pack forum. This is monitored by the machine learning team, so we'll be able to help you better there.
I'm a little confused because, on one hand, your mapping for start_timestamp specifies "format": "YYYY-MM-dd HH:mm:ss.SSS z", but your example document shows a start_timestamp that looks like "July 1st 2017, 20:11:27", which doesn't match the mapping.
I did a quick mockup using your example mappings, but I had to change your example document so that the start_timestamp is of the correct format. After I inserted a few dummy docs into that index, I tried to reproduce your problem, but ML doesn't give me the error you describe when I try to create a job.
Did you specify in the Kibana index pattern that start_timestamp is the time field for this index?
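I looked up the error that is reported from Elasticsearch ("value source config is invalid; must have either a field context or a script or marked as unwrapped").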
This is from the code that (from what I can tell) attempts to build and validate an aggregation for a query, which the ML job you've selected will try to do behind the scenes. Part of the aggregated query ML attempts to make is one in which the time field for the index is used in a date_histogram aggregation. (The advanced jobs in ML don't automatically attempt to create this type of aggregated query behind the scenes, which is why you're not getting this error when trying to create an Advanced job.)
Clearly, there is something that your Elasticsearch doesn't like about the time field when it attempts to build this aggregation, so it throws the error. In other words, this is not really a problem with ML.
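For reference, the kind of aggregated search described above might look roughly like the request body below. This is only a sketch: the helper name, the aggregation name counts_over_time, and the "1h" interval are assumptions for illustration, and the exact query the ML UI sends may differ.

```python
import json


def build_count_over_time_query(time_field, interval="1h"):
    """Return a search body that buckets documents by time.

    Hypothetical helper: approximates a count-over-time search,
    not the exact request issued by the ML wizard.
    """
    return {
        "size": 0,  # only the aggregation buckets are needed, not the hits
        "aggs": {
            "counts_over_time": {
                "date_histogram": {"field": time_field, "interval": interval}
            }
        },
    }


body = build_count_over_time_query("request.start_timestamp")
print(json.dumps(body, indent=2))
```

Running a body like this against the index (for example from the Kibana Dev Tools console) is one way to check whether the same error appears outside ML.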
I'm not sure what to suggest to you as next steps, since I couldn't reproduce your problem even trying to match your setup as close as possible.
Thanks both for the quick reply. Rich, you found the issue. A small number of my records (hundreds out of millions) had the timestamp in a different format, and that seems to have prevented the machine learning job from running.
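For anyone hitting the same thing, a check along these lines before indexing might catch the stray records. It is a minimal sketch: the regular expression only approximates the "YYYY-MM-dd HH:mm:ss.SSS z" format from the mapping (it does not validate the time zone name or value ranges), and the flat record layout and the find_malformed helper are assumptions for the example.

```python
import re

# Approximation of "YYYY-MM-dd HH:mm:ss.SSS z": digits in the right places
# followed by a zone token. Assumption: zone names contain no spaces.
TIMESTAMP_PATTERN = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3} \S+$"
)


def find_malformed(records, field="request.start_timestamp"):
    """Return the records whose timestamp does not match the expected layout."""
    return [
        record
        for record in records
        if not TIMESTAMP_PATTERN.match(str(record.get(field, "")))
    ]


# Made-up sample records for illustration only.
sample = [
    {"request.start_timestamp": "2017-07-01 20:11:27.000 UTC"},
    {"request.start_timestamp": "July 1st 2017, 20:11:27"},
]
bad = find_malformed(sample)
print(len(bad), "malformed record(s)")
```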
You'd think there would be some validation or a more detailed error (given that Timelion worked and all other time-series graphs worked).
Glad to hear! Yeah, sorry that the error message was less than informative - the ML UI gets error messages like these directly from Elasticsearch (again, not really a specific ML error).
I'll ask the Elasticsearch team if there's a way to make this particular error message more informative.