Created ML jobs with bucket spans of 15m and 1d

Hi All,

I have created a multi-metric ML job with a bucket span of 15m using the functions max(A) and max(B), split by hostname, and I have also created another job with a bucket span of 1d. I would like to know how exactly the bucket span works: does it take the maximum value of A over 15m or over 1d per host? And how is the anomaly score calculated based on the bucket span? Can anyone explain this in detail?

Thanks in advance!

We have this blog post which covers bucket spans: https://www.elastic.co/blog/explaining-the-bucket-span-in-machine-learning-for-elasticsearch

Let us know if that answers your questions or if you need more information.

Best,
Walter


Thanks for your response.
I have one doubt: if there is less data to analyze, should the bucket span be larger?

This depends on the use case. ML can work with sparse data too, but if you expect your data to be non-sparse within a given time frame, then it makes sense to tweak the bucket span accordingly.

You can use the Single Metric Wizard to experiment with different bucket spans.

For example, this dataset shows gaps when the bucket span is only 1 second:

Changing the bucket span to 1 minute for the same dataset results in a continuous line:

Note that 1 second or 1 minute are not necessarily good or bad bucket spans in general, it depends on the type of data you have and the patterns you expect to emerge. The "Estimate bucket span" button provides a helper function that will try to come up with a reasonable bucket span by analysing the source data. It is available in all job wizards.
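To make your original question concrete, here is a minimal sketch of such a job using the 6.x job creation API (the job ID and the field names A and B are placeholders): with a bucket span of 15m, max(A) is evaluated over each 15-minute bucket separately for every beat.hostname, while the 1d job would take the maximum over each full day per host instead.

PUT _xpack/ml/anomaly_detectors/max_per_host_15m
{
  "analysis_config": {
    "bucket_span": "15m",
    "detectors": [
      { "function": "max", "field_name": "A", "partition_field_name": "beat.hostname" },
      { "function": "max", "field_name": "B", "partition_field_name": "beat.hostname" }
    ],
    "influencers": [ "beat.hostname" ]
  },
  "data_description": { "time_field": "@timestamp" }
}

The anomaly score is then calculated per bucket: the observed value in each bucket is compared against the probability distribution the model has learned for that host, so a longer bucket span smooths over short spikes while a shorter one reacts to them.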

I created a multi-metric job over one month of data and the processed record count is 10x,xxx,xxx, but the Metric Viewer looks like this

output after viewing results

I created a single-metric job too

Can you let me know why?

It's hard to tell what's wrong this way. Can you also provide a screenshot of the results you're seeing in the previews of the job creation wizards? In addition, it would be useful if you could post a sample document you're analyzing as well as the resulting job config JSON. Please also explain the use case you're working on; that will help me make better suggestions. Thanks, Walter
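PS: if it's easier, you can pull the job config and datafeed config straight from the API rather than from the UI (a sketch assuming the 6.x endpoint paths; my_job is a placeholder for your actual job ID):

GET _xpack/ml/anomaly_detectors/my_job
GET _xpack/ml/datafeeds/datafeed-my_job

A sample document can be fetched with a plain search against the source index:

GET metricbeat-*/_search
{
  "size": 1
}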

Hey, I am analyzing the average of total CPU utilization on the fields system.cpu.total.pct and system.cpu.total.norm.pct using Metricbeat data. There are Metric Viewer results for system.cpu.total.pct but not for system.cpu.total.norm.pct.

Job config JSON

{
  "job_id": "total",
  "job_type": "anomaly_detector",
  "job_version": "6.4.1",
  "description": "",
  "create_time": 1540299791986,
  "finished_time": 1540303801057,
  "established_model_memory": 2687544,
  "analysis_config": {
    "bucket_span": "15m",
    "detectors": [
      {
        "detector_description": "mean(system.cpu.total.pct)",
        "function": "mean",
        "field_name": "system.cpu.total.pct",
        "partition_field_name": "beat.hostname",
        "detector_index": 0
      },
      {
        "detector_description": "mean(system.cpu.total.norm.pct)",
        "function": "mean",
        "field_name": "system.cpu.total.norm.pct",
        "partition_field_name": "beat.hostname",
        "detector_index": 1
      }
    ],
    "influencers": [
      "beat.hostname"
    ]
  },
  "analysis_limits": {
    "model_memory_limit": "17mb",
    "categorization_examples_limit": 4
  },
  "data_description": {
    "time_field": "@timestamp",
    "time_format": "epoch_ms"
  },
  "model_snapshot_retention_days": 1,
  "custom_settings": {
    "created_by": "multi-metric-wizard"
  },
  "model_snapshot_id": "1540303799",
  "results_index_name": "shared",
  "data_counts": {
    "job_id": "total",
    "processed_record_count": 109742474,
    "processed_field_count": 113691080,
    "input_bytes": 7809229831,
    "input_field_count": 113691080,
    "invalid_date_count": 0,
    "missing_field_count": 215536342,
    "out_of_order_timestamp_count": 0,
    "empty_bucket_count": 31,
    "sparse_bucket_count": 3,
    "bucket_count": 2356,
    "earliest_record_timestamp": 1538179200045,
    "latest_record_timestamp": 1540299624572,
    "last_data_time": 1540303799182,
    "latest_empty_bucket_timestamp": 1538748900000,
    "latest_sparse_bucket_timestamp": 1538720100000,
    "input_record_count": 109742474
  },
  "model_size_stats": {
    "job_id": "total",
    "result_type": "model_size_stats",
    "model_bytes": 2687544,
    "total_by_field_count": 94,
    "total_over_field_count": 0,
    "total_partition_field_count": 93,
    "bucket_allocation_failures_count": 0,
    "memory_status": "ok",
    "log_time": 1540303799000,
    "timestamp": 1540298700000
  },
  "datafeed_config": {
    "datafeed_id": "datafeed-total",
    "job_id": "total",
    "query_delay": "116810ms",
    "indices": [
      "metricbeat-*"
    ],
    "types": [],
    "query": {
      "match_all": {
        "boost": 1
      }
    },
    "scroll_size": 1000,
    "chunking_config": {
      "mode": "auto"
    },
    "state": "stopped"
  },
  "state": "closed"
}
Job results

No results are found for system.cpu.total.norm.pct. Why is that?

As far as I can tell, this is not an issue with the Machine Learning plugin by itself but rather with the data you have at hand. I just tested this with a default installation of metricbeat and there is simply no data saved to the field system.cpu.total.norm.pct.
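One quick way to verify this against your own cluster (a sketch; adjust the index pattern to whatever your datafeed reads from) is to count the documents that actually contain the field:

GET metricbeat-*/_search
{
  "size": 0,
  "query": {
    "exists": {
      "field": "system.cpu.total.norm.pct"
    }
  }
}

If the hit count comes back as 0, there is simply nothing for that detector to analyze.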

metricbeat needs to be explicitly configured to write to that field; have a look at the docs here: https://www.elastic.co/guide/en/beats/metricbeat/current/metricbeat-metricset-system-cpu.html#_configuration_3
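For reference, the setting lives in the system module configuration (typically modules.d/system.yml); a minimal sketch, assuming otherwise default settings:

- module: system
  period: 10s
  metricsets:
    - cpu
  # report both raw and normalized (per-core) CPU percentages
  cpu.metrics: ["percentages", "normalized_percentages"]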

Once metricbeat is configured to log normalized_percentages, that field should be populated. Here's an example:

As you can see in the screenshot above, the preview charts in the job creation wizard can help you identify whether you're about to analyze the expected data: if you don't see any data showing up there, a Machine Learning job run with that configuration will not return any results. These preview charts, along with Machine Learning's Data Visualizer (https://www.elastic.co/blog/machine-learning-data-visualizer-and-modules), can help you identify the characteristics of your source data.

Thank you, but I can see this field in Discover. How is that possible?

As far as I can see, the field you selected in your last screenshot from Discover is a different one (system.process.cpu.total.norm.pct) compared to the ones you used in the Machine Learning job configs (system.cpu.total.norm.pct). The "process" based one is available by default.

Yeah, sorry, I got confused. I made the changes in the system.yml file and it worked.
Thanks for your help!

