Expected numeric type on field [...], but got [keyword]

machine-learning

#1

In 6.2.4 I have a machine learning job with a summary count that is set to a keyword field. It runs just fine. In the anomaly explorer I select an anomaly and click on Open Link / View series. That opens a new window that hangs forever with a Loading spinner.

In the Elasticsearch log I see the search failing because it runs a sum aggregation on the keyword field. Not very friendly.

[2018-05-15T15:43:59,193][DEBUG][o.e.a.s.TransportSearchAction] [...] [nypd-complaints][0], node[Sup8CgURSpCU7gKD-NwAWA], [P], s[STARTED], a[id=CjolmCJBSn-WLYhYzbJR3A]: Failed to execute [SearchRequest{searchType=QUERY_THEN_FETCH, indices=[nypd-complaints], indicesOptions=IndicesOptions[id=38, ignore_unavailable=false, allow_no_indices=true, expand_wildcards_open=true, expand_wildcards_closed=false, allow_aliases_to_multiple_indices=true, forbid_closed_indices=true, ignore_aliases=false], types=[], routing='null', preference='null', requestCache=null, scroll=null, maxConcurrentShardRequests=5, batchedReduceSize=512, preFilterShardSize=128, source={"size":0,"query":{"bool":{"must":[{"range":{"@timestamp":{"from":1135296000000,"to":1485215999999,"include_lower":true,"include_upper":true,"format":"epoch_millis","boost":1.0}}},{"match_all":{"boost":1.0}},{"query_string":{"query":"Borough.keyword:\"STATEN ISLAND\"","fields":[],"type":"best_fields","default_operator":"or","max_determinized_states":10000,"enable_position_increments":true,"fuzziness":"AUTO","fuzzy_prefix_length":0,"fuzzy_max_expansions":50,"phrase_slop":0,"analyze_wildcard":false,"escape":false,"auto_generate_synonyms_phrase_query":true,"fuzzy_transpositions":true,"boost":1.0}}],"adjust_pure_negative":true,"boost":1.0}},"_source":{"includes":[],"excludes":[]},"aggregations":{"byTime":{"date_histogram":{"field":"@timestamp","interval":"1M","offset":0,"order":{"_key":"asc"},"keyed":false,"min_doc_count":0},"aggregations":{"metric":{"sum":{"field":"OffenseCode.keyword"}}}}}}}]
org.elasticsearch.transport.RemoteTransportException: [...][127.0.0.1:9300][indices:data/read/search[phase/query]]
Caused by: java.lang.IllegalArgumentException: Expected numeric type on field [OffenseCode.keyword], but got [keyword]
        at org.elasticsearch.search.aggregations.support.ValuesSourceConfig.numericField(ValuesSourceConfig.java:307) ~[elasticsearch-6.2.4.jar:6.2.4]

and

[2018-05-15T15:43:59,201][DEBUG][o.e.a.s.TransportSearchAction] [...] All shards failed for phase: [query]
org.elasticsearch.ElasticsearchException$1: Expected numeric type on field [OffenseCode.keyword], but got [keyword]
        at org.elasticsearch.ElasticsearchException.guessRootCauses(ElasticsearchException.java:619) ~[elasticsearch-6.2.4.jar:6.2.4]

(David Kyle) #2

Hi Badger,

Can you share the full job configuration, please?


(Pete Harverson) #3

Hi Badger,

The charts in the Single Metric Viewer and the Anomaly Explorer only work if the summary count field is a numeric type, because the Elasticsearch aggregations they use to fetch the chart data only work on numeric fields.

The machine learning engine will handle a keyword field as a summary count field, as it converts the values internally from strings to numbers, but as you have found, the charts in the Single Metric Viewer and Anomaly Explorer do not handle this gracefully. We will fix this so that if a non-numeric field is used as a summary count field, the link to the Single Metric Viewer is disabled and no attempt is made to chart the data in the Anomaly Explorer.
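To illustrate the kind of check described above, here is a minimal Python sketch that looks up a field's mapping type and decides whether a sum aggregation could run on it. It is not the Kibana code. The helper names (`field_type`, `can_chart`) are made up for this example, and the sample mapping is an assumption: the real mapping of nypd-complaints was not posted, so the sketch assumes the common dynamic mapping of a text field with a `keyword` sub-field.

```python
# Numeric mapping types that metric aggregations such as sum accept.
NUMERIC_TYPES = {
    "long", "integer", "short", "byte",
    "double", "float", "half_float", "scaled_float",
}


def field_type(properties, field_path):
    """Return the mapping type for a dotted field path, or None if unknown.

    `properties` is the "properties" object of an index mapping. Each path
    segment is looked up in object sub-properties and in multi-fields.
    """
    parts = field_path.split(".")
    node = properties.get(parts[0])
    for part in parts[1:]:
        if node is None:
            return None
        children = {}
        children.update(node.get("properties", {}))
        children.update(node.get("fields", {}))
        node = children.get(part)
    return None if node is None else node.get("type")


def can_chart(properties, summary_count_field):
    """True if a sum aggregation on the field would be accepted."""
    return field_type(properties, summary_count_field) in NUMERIC_TYPES


# ASSUMED mapping, for illustration only (not taken from the thread).
sample_properties = {
    "@timestamp": {"type": "date"},
    "OffenseCode": {
        "type": "text",
        "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
    },
    "Borough": {
        "type": "text",
        "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
    },
}

print(field_type(sample_properties, "OffenseCode.keyword"))
print(can_chart(sample_properties, "OffenseCode.keyword"))
```

With a check like this, the chart link can be disabled up front instead of sending a search that fails with "Expected numeric type on field".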

If you could share the full job configuration (copy the configuration from the JSON tab of the job in the Job Management page) we may be able to suggest an alternative configuration which will allow the charts to be plotted. Or could you change the mapping type of the field you are using as the summary count field?

Thanks
Pete


#4

Actually, I should not have been using a summary count for this job. Anyway, this is what the job looks like:

{"job_id":"test","job_type":"anomaly_detector","job_version":"6.2.4","description":"","create_time":1526477474122,"finished_time":1526478331845,"established_model_memory":54224,"analysis_config":{"bucket_span":"12h","summary_count_field_name":"OffenseCode.keyword","detectors":[{"detector_description":"count over \"Borough.keyword\"","function":"count","over_field_name":"Borough.keyword","rules":[],"detector_index":0}],"influencers":[]},"analysis_limits":{"model_memory_limit":"1024mb"},"data_description":{"time_field":"@timestamp","time_format":"epoch_ms"},"model_snapshot_retention_days":1,"model_snapshot_id":"1526478331","results_index_name":"shared","data_counts":{"job_id":"test","processed_record_count":5580035,"processed_field_count":11159607,"input_bytes":476683499,"input_field_count":11159607,"invalid_date_count":0,"missing_field_count":463,"out_of_order_timestamp_count":0,"empty_bucket_count":4017,"sparse_bucket_count":0,"bucket_count":8034,"earliest_record_timestamp":1136091600000,"latest_record_timestamp":1483160400000,"last_data_time":1526478321986,"latest_empty_bucket_timestamp":1483099200000,"input_record_count":5580035},"model_size_stats":{"job_id":"test","result_type":"model_size_stats","model_bytes":54224,"total_by_field_count":3,"total_over_field_count":5,"total_partition_field_count":2,"bucket_allocation_failures_count":0,"memory_status":"ok","log_time":1526478331000,"timestamp":1483099200000},"datafeed_config":{"datafeed_id":"datafeed-test","job_id":"test","query_delay":"99564ms","indices":["nypd-complaints"],"types":[],"query":{"match_all":{"boost":1}},"scroll_size":1000,"chunking_config":{"mode":"auto"},"state":"stopped"},"state":"closed"}

(Pete Harverson) #5

Yes, looks like there isn't any need to use a summary count field for this job. Looks like count by OffenseCode.keyword over Borough.keyword is what you need here?
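As a rough sketch of that change, the Python below takes the `analysis_config` from the job posted above, drops `summary_count_field_name`, and adds `by_field_name` to the detector. The function name `without_summary_count` is made up for this example. Removing `detector_description` and `detector_index` is an assumption that they get regenerated when a new job is created from the result.

```python
import copy
import json


def without_summary_count(analysis_config, by_field):
    """Return a copy of analysis_config that splits the count by `by_field`
    instead of reading pre-summarised counts from a summary count field."""
    cfg = copy.deepcopy(analysis_config)
    cfg.pop("summary_count_field_name", None)
    for detector in cfg["detectors"]:
        detector["by_field_name"] = by_field
        # Assumption: these are regenerated for a new job, so the stale
        # values from the old job are dropped.
        detector.pop("detector_description", None)
        detector.pop("detector_index", None)
    return cfg


# analysis_config as posted in the job JSON earlier in the thread.
original = {
    "bucket_span": "12h",
    "summary_count_field_name": "OffenseCode.keyword",
    "detectors": [
        {
            "detector_description": 'count over "Borough.keyword"',
            "function": "count",
            "over_field_name": "Borough.keyword",
            "rules": [],
            "detector_index": 0,
        }
    ],
    "influencers": [],
}

revised = without_summary_count(original, "OffenseCode.keyword")
print(json.dumps(revised, indent=2))
```

The printed object is meant as the `analysis_config` of a new job rather than an in-place update of the existing one.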

Pete


#6

Yes indeed.

