ML Multi-Metric query fails when similar Single-Metric is OK

I'm starting to play with ML jobs, but I've encountered an issue with a Multi-Metric job.
It fails with:

Datafeed is encountering errors extracting data: [ml-multi-low-count-test]
Search request returned shard failures; first failure: shard [[YuBCwg][logstash-general-2017.08.09][0]], reason [RemoteTransportException[[elasticsearch][127.0.0.1:9300]
[indices:data/read/search[phase/query]]]; nested: QueryShardException[No mapping found for [@timestamp] in order to sort on]; ]; see logs for more info

A Single-Metric job with the same indices works fine. But as far as I can see, time_field is the same in both cases:

"data_description": {
"time_field": "@timestamp",
"time_format": "epoch_ms"

Any ideas on how to debug this?

Hi Vitaly,

Please do the following for both jobs (the working single-metric and the non-working multi-metric)

curl -u elastic:changeme -XGET 'localhost:9200/_xpack/ml/anomaly_detectors/<job_id>?pretty'
curl -u elastic:changeme -XGET 'localhost:9200/_xpack/ml/datafeeds/datafeed-<job_id>?pretty'

Then we can compare the two configurations.

Thank you, please see below:

{
  "count" : 1,
  "jobs" : [
    {
      "job_id" : "ml-test-single-metric-low-count",
      "job_type" : "anomaly_detector",
      "job_version" : "5.5.2",
      "description" : "2nd 24.08.2017",
      "create_time" : 1503565565772,
      "finished_time" : 1503565567428,
      "analysis_config" : {
        "bucket_span" : "5m",
        "summary_count_field_name" : "doc_count",
        "detectors" : [
          {
            "detector_description" : "low_count",
            "function" : "low_count",
            "detector_rules" : [ ],
            "detector_index" : 0
          }
        ],
        "influencers" : [ ]
      },
      "data_description" : {
        "time_field" : "@timestamp",
        "time_format" : "epoch_ms"
      },
      "model_plot_config" : {
        "enabled" : true
      },
      "model_snapshot_retention_days" : 1,
      "model_snapshot_id" : "1503828750",
      "results_index_name" : "shared"
    }
  ]
}
2)
{
  "count" : 1,
  "jobs" : [
    {
      "job_id" : "ml-multi-low-count-test",
      "job_type" : "anomaly_detector",
      "job_version" : "5.5.2",
      "description" : "3rd",
      "create_time" : 1503568129335,
      "finished_time" : 1503568130891,
      "analysis_config" : {
        "bucket_span" : "5m",
        "detectors" : [
          {
            "detector_description" : "low_count",
            "function" : "low_count",
            "partition_field_name" : "type.keyword",
            "detector_rules" : [ ],
            "detector_index" : 0
          }
        ],
        "influencers" : [
          "type.keyword"
        ]
      },
      "data_description" : {
        "time_field" : "@timestamp",
        "time_format" : "epoch_ms"
      },
      "model_snapshot_retention_days" : 1,
      "results_index_name" : "shared"
    }
  ]
}

Vitaly - please also collect the output of the datafeed config for each job. See above.

Thank you, here it is:

{
  "count" : 1,
  "datafeeds" : [
    {
      "datafeed_id" : "datafeed-ml-multi-low-count-test",
      "job_id" : "ml-multi-low-count-test",
      "query_delay" : "60s",
      "frequency" : "150s",
      "indices" : [
        "logstash-*"
      ],
      "types" : [
        "newrelic-nrsysmond",
        "bvpyzabbix",
        "rabbitmq",
        ........
      ],
      "query" : {
        "match_all" : {
          "boost" : 1.0
        }
      },
      "scroll_size" : 1000,
      "chunking_config" : {
        "mode" : "auto"
      }
    }
  ]
}

^^ This is missing the datafeed config for the single-metric job.

My guess is that in the multi-metric job, some of the data "types" in your "logstash-*" indices have a timestamp field different from the others.
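One way to verify (just a sketch - the only assumptions are the host and credentials from the earlier commands) is the get-field-mapping API, which shows how @timestamp is mapped in every index that matches the pattern:

curl -u elastic:changeme -XGET 'localhost:9200/logstash-*/_mapping/field/@timestamp?pretty'

Any index that comes back with an empty "mappings" object should be one with no @timestamp mapping at all.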

Just so you know - "types" are being deprecated by Elasticsearch in v6.0. Just so you're prepared...

Thank you 🙂
BTW, there are about 20 types; I just cut them out of the output.

It was in my previous message, after the separator.
Here it is:

{
  "count" : 1,
  "datafeeds" : [
    {
      "datafeed_id" : "datafeed-ml-test-single-metric-low-count",
      "job_id" : "ml-test-single-metric-low-count",
      "query_delay" : "60s",
      "frequency" : "150s",
      "indices" : [
        "logstash-*"
      ],
      "types" : [
        "newrelic-nrsysmond",
        .........
      ],
      "query" : {
        "match_all" : {
          "boost" : 1.0
        }
      },
      "aggregations" : {
        "buckets" : {
          "date_histogram" : {
            "field" : "@timestamp",
            "interval" : 300000,
            "offset" : 0,
            "order" : {
              "_key" : "asc"
            },
            "keyed" : false,
            "min_doc_count" : 0
          },
          "aggregations" : {
            "@timestamp" : {
              "max" : {
                "field" : "@timestamp"
              }
            }
          }
        }
      },
      "scroll_size" : 1000,
      "chunking_config" : {
        "mode" : "manual",
        "time_span" : "300000000ms"
      }
    }
  ]
}

Right, so the types above ^^ are extraneous compared to the config of the single-metric job (which works for you). Clone your existing multi-metric job (to keep most of the config parameters), remove the extraneous types, and then try to run the job...

Sorry for the confusing output - the "types" are the same in both cases. I just truncated the output differently (the original output has 50 types).

Oh ok! That's good to know.

I think that you must have one (or more) of those 50 types with a missing mapping for @timestamp (??)

In the single-metric job, the query to the index automatically includes a date_histogram aggregation on the field @timestamp (as you can see above). The multi-metric job does not do this. So, perhaps, the single-metric job's aggregation is masking the problem in your data?
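You can reproduce what the datafeed does outside of ML with a plain search that sorts on the time field (a sketch, nothing job-specific in it) - it should hit the same shard failure on any matched index that has no @timestamp mapping:

GET logstash-*/_search
{
  "size": 1,
  "sort": [
    { "@timestamp": { "order": "asc" } }
  ]
}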

I can also suggest that you inspect the elasticsearch.log file when hitting the datafeed "preview" for the problematic job:

GET _xpack/ml/datafeeds/datafeed-ml-multi-low-count-test/_preview/

Thank you, it really was one of the indices without a proper timestamp mapping. The job runs fine as long as we don't include that index.
Vitaly


Rich,

I'd appreciate your help again. After further investigation, it seems that the ML job fails not only on indices without a proper @timestamp field mapping, but also on empty indices.

Job fails with this message:

Datafeed is encountering errors extracting data: [ml-multicount-all-indices-count-test] Search request returned shard failures; first failure: shard [[pREMVKEzTWe-J3vLYuBCwg][logstash-general-2017.09.02][0]], reason [RemoteTransportException[[staging-elk-elasticsearch-][127.0.0.1:9300][indices:data/read/search[phase/query]]]; nested: QueryShardException[No mapping found for [@timestamp] in order to sort on]; ]; see logs for more info

This index is empty, and its mapping is:

"logstash-general-2017.09.02": {
"mappings": {
"default": {
"properties": {
"@timestamp": {
"type": "date"
},
"geoip": {
"properties": {
"location": {
"type": "geo_point"
}
}
}
}
}
}
}

Vitaly,

Under what conditions do you have empty indices? That shouldn't be the case if you're using daily indices.
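If it helps, _cat/indices can list the document count per index so you can spot the empty ones (just an example; adjust the index pattern as needed):

curl -u elastic:changeme -XGET 'localhost:9200/_cat/indices/logstash-*?v&h=index,docs.count'

Anything with a docs.count of 0 is an empty index.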

Also, I'll remind you again that, as a best practice, you should separate different kinds of data into different indices. In v6.0, Elasticsearch will be deprecating the "_type" mechanism, which allows different types of data to exist in the same physical index.

As far as I can see, the Logstash output filter sends wrongly formatted data - like two comma-separated strings in the "type" field.

Yes, I remember. But AFAIK Elasticsearch performance may be affected by a large number of shards?
