Datafeed creation via API

Hi,

I have created 2 identical datafeeds:

  • 1 datafeed created via Kibana (datafeed_job_device1)
  • 1 datafeed created via API (api_datafeed_api_job_device1)

The JSON configurations of these 2 datafeeds are exactly the same, but:

  • GET _ml/datafeeds/datafeed_job_device1/_preview provides search results
  • GET _ml/datafeeds/api_datafeed_api_job_device1/_preview is empty

Here is the JSON for the datafeed created via API (apart from datafeed_id, job_id and query_delay, it is the same when created via Kibana):

{
  "count" : 1,
  "datafeeds" : [
    {
      "datafeed_id" : "api_datafeed_api_job_device1",
      "job_id" : "api_job_device1",
      "query_delay" : "100488ms",
      "chunking_config" : {
        "mode" : "manual",
        "time_span" : "90000000ms"
      },
      "indices_options" : {
        "expand_wildcards" : [
          "open"
        ],
        "ignore_unavailable" : false,
        "allow_no_indices" : true,
        "ignore_throttled" : true
      },
      "query" : {
        "bool" : {
          "should" : [
            {
              "match_phrase" : {
                "device.name.keyword" : "device1"
              }
            }
          ],
          "minimum_should_match" : 1,
          "filter" : [ ],
          "must_not" : [ ]
        }
      },
      "indices" : [
        "lan_metrics*"
      ],
      "aggregations" : {
        "buckets" : {
          "date_histogram" : {
            "field" : "@timestamp",
            "fixed_interval" : "90000ms"
          },
          "aggregations" : {
            "resources.memory.used.percent" : {
              "max" : {
                "field" : "resources.memory.used.percent"
              }
            },
            "@timestamp" : {
              "max" : {
                "field" : "@timestamp"
              }
            }
          }
        }
      },
      "scroll_size" : 1000,
      "delayed_data_check_config" : {
        "enabled" : true
      }
    }
  ]
}

Any ideas?

Hi Stephane,

Just to eliminate any potential mistakes when comparing the different JSON configs, could you please supply the datafeed config for datafeed_job_device1, in the same way you did for api_datafeed_api_job_device1 above, using the command:
GET _ml/datafeeds/<datafeed ID>

Also, could you please supply the job configs for each of these datafeeds:
GET _ml/anomaly_detectors/<job ID>

Thanks,
James

Hi @James_Gowdy

and thanks for having a look at this topic!

Please find below the requested information (Kibana/API datafeeds and Kibana/API jobs):

  • Kibana datafeed : GET _ml/datafeeds/datafeed_job_device1 :
{
  "count" : 1,
  "datafeeds" : [
    {
      "datafeed_id" : "datafeed_job_device1",
      "job_id" : "job_device1",
      "query_delay" : "84790ms",
      "chunking_config" : {
        "mode" : "manual",
        "time_span" : "90000000ms"
      },
      "indices_options" : {
        "expand_wildcards" : [
          "open"
        ],
        "ignore_unavailable" : false,
        "allow_no_indices" : true,
        "ignore_throttled" : true
      },
      "query" : {
        "bool" : {
          "should" : [
            {
              "match_phrase" : {
                "device.name.keyword" : "device1"
              }
            }
          ],
          "minimum_should_match" : 1,
          "filter" : [ ],
          "must_not" : [ ]
        }
      },
      "indices" : [
        "lan_metrics*"
      ],
      "aggregations" : {
        "buckets" : {
          "date_histogram" : {
            "field" : "@timestamp",
            "fixed_interval" : "90000ms"
          },
          "aggregations" : {
            "resources.memory.used.percent" : {
              "max" : {
                "field" : "resources.memory.used.percent"
              }
            },
            "@timestamp" : {
              "max" : {
                "field" : "@timestamp"
              }
            }
          }
        }
      },
      "scroll_size" : 1000,
      "delayed_data_check_config" : {
        "enabled" : true
      }
    }
  ]
}

  • API datafeed : GET _ml/datafeeds/api_datafeed_api_job_device1 :
{
  "count" : 1,
  "datafeeds" : [
    {
      "datafeed_id" : "api_datafeed_api_job_device1",
      "job_id" : "api_job_device1",
      "query_delay" : "100488ms",
      "chunking_config" : {
        "mode" : "manual",
        "time_span" : "90000000ms"
      },
      "indices_options" : {
        "expand_wildcards" : [
          "open"
        ],
        "ignore_unavailable" : false,
        "allow_no_indices" : true,
        "ignore_throttled" : true
      },
      "query" : {
        "bool" : {
          "should" : [
            {
              "match_phrase" : {
                "device.name.keyword" : "device1"
              }
            }
          ],
          "minimum_should_match" : 1,
          "filter" : [ ],
          "must_not" : [ ]
        }
      },
      "indices" : [
        "lan_metrics*"
      ],
      "aggregations" : {
        "buckets" : {
          "date_histogram" : {
            "field" : "@timestamp",
            "fixed_interval" : "90000ms"
          },
          "aggregations" : {
            "resources.memory.used.percent" : {
              "max" : {
                "field" : "resources.memory.used.percent"
              }
            },
            "@timestamp" : {
              "max" : {
                "field" : "@timestamp"
              }
            }
          }
        }
      },
      "scroll_size" : 1000,
      "delayed_data_check_config" : {
        "enabled" : true
      }
    }
  ]
}

  • Kibana job : GET _ml/anomaly_detectors/job_device1 :
{
  "count" : 1,
  "jobs" : [
    {
      "job_id" : "job_device1",
      "job_type" : "anomaly_detector",
      "job_version" : "7.12.1",
      "create_time" : 1625646576992,
      "finished_time" : 1625738419900,
      "model_snapshot_id" : "1625738418",
      "groups" : [
        "forecast",
        "stephane"
      ],
      "description" : "",
      "analysis_config" : {
        "bucket_span" : "15m",
        "summary_count_field_name" : "doc_count",
        "detectors" : [
          {
            "detector_description" : """max("resources.memory.used.percent")""",
            "function" : "max",
            "field_name" : "resources.memory.used.percent",
            "detector_index" : 0
          }
        ],
        "influencers" : [ ]
      },
      "analysis_limits" : {
        "model_memory_limit" : "11mb",
        "categorization_examples_limit" : 4
      },
      "data_description" : {
        "time_field" : "@timestamp",
        "time_format" : "epoch_ms"
      },
      "model_plot_config" : {
        "enabled" : true,
        "annotations_enabled" : true
      },
      "model_snapshot_retention_days" : 10,
      "daily_model_snapshot_retention_after_days" : 1,
      "results_index_name" : "custom-job_device1",
      "allow_lazy_open" : false
    }
  ]
}
  • API job : GET _ml/anomaly_detectors/api_job_device1 :
{
  "count" : 1,
  "jobs" : [
    {
      "job_id" : "api_job_device1",
      "job_type" : "anomaly_detector",
      "job_version" : "7.12.1",
      "create_time" : 1625668327864,
      "finished_time" : 1625669597160,
      "description" : "API job on resources.memory",
      "analysis_config" : {
        "bucket_span" : "15m",
        "summary_count_field_name" : "doc_count",
        "detectors" : [
          {
            "detector_description" : """max("resources.memory.used.percent")""",
            "function" : "max",
            "field_name" : "resources.memory.used.percent",
            "detector_index" : 0
          }
        ],
        "influencers" : [ ]
      },
      "analysis_limits" : {
        "model_memory_limit" : "20mb",
        "categorization_examples_limit" : 4
      },
      "data_description" : {
        "time_field" : "timestamp",
        "time_format" : "epoch_ms"
      },
      "model_snapshot_retention_days" : 10,
      "daily_model_snapshot_retention_after_days" : 1,
      "results_index_name" : "custom-api_job_device1",
      "allow_lazy_open" : false
    }
  ]
}

Please also find:

  • job messages related to api_job_device1: [screenshot not preserved]

  • datafeed stats related to api_datafeed_api_job_device1: [screenshot "datafeed_stats" not preserved]

Hi Stephane,

Looking at the two job configs, it looks like the time_field for the job api_job_device1 might be wrong:
"time_field" : "timestamp",

Whereas job_device1 uses:
"time_field" : "@timestamp",

James
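
For anyone comparing two configs by hand: the comparison above can be sketched in Python as a recursive diff that reports every field whose value differs. This is only an illustration. The helper name diff_configs is hypothetical (it is not part of any Elastic client), and the two dictionaries are trimmed copies of the job configs posted in this thread.

```python
def diff_configs(a, b, path=""):
    """Return (path, value_in_a, value_in_b) for every leaf that differs.

    A key that is missing on one side is reported with None on that side.
    """
    diffs = []
    if isinstance(a, dict) and isinstance(b, dict):
        for key in sorted(set(a) | set(b)):
            child = f"{path}.{key}" if path else key
            if key not in a:
                diffs.append((child, None, b[key]))
            elif key not in b:
                diffs.append((child, a[key], None))
            else:
                diffs.extend(diff_configs(a[key], b[key], child))
    elif a != b:
        diffs.append((path, a, b))
    return diffs


# Trimmed copies of the two job configs from this thread.
kibana_job = {
    "analysis_config": {"bucket_span": "15m", "summary_count_field_name": "doc_count"},
    "data_description": {"time_field": "@timestamp", "time_format": "epoch_ms"},
}
api_job = {
    "analysis_config": {"bucket_span": "15m", "summary_count_field_name": "doc_count"},
    "data_description": {"time_field": "timestamp", "time_format": "epoch_ms"},
}

job_diffs = diff_configs(kibana_job, api_job)
for field, kibana_value, api_value in job_diffs:
    print(f"{field}: {kibana_value!r} != {api_value!r}")
```

On the full configs this would also list fields that are expected to differ (job_id, create_time, model_memory_limit and so on), so the output still needs a human eye.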

Thanks James

and sorry to have bothered you with this kind of mistake...
I didn't know that the datafeed preview was directly linked to the job configuration.

After correcting the time_field to "@timestamp", the datafeed and the associated job created via API now work fine.
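
In case it helps someone else: as far as I understand the rules for aggregated datafeeds, the date_histogram has to run on the job's time_field and contain a max aggregation named after that same field, which would be consistent with a wrong time_field leaving the preview empty. Below is a minimal Python sketch of that check. The helper name time_field_problems is hypothetical, the rule it encodes is my assumption rather than something confirmed in this thread, and it expects the inner objects of the "jobs" and "datafeeds" arrays returned by the GET requests.

```python
def time_field_problems(job, datafeed):
    """List mismatches between a job's time_field and its datafeed's aggregations.

    An empty list means no mismatch was found by this sketch.
    """
    time_field = job["data_description"]["time_field"]
    problems = []
    for name, agg in datafeed.get("aggregations", {}).items():
        histogram = agg.get("date_histogram")
        if histogram is None:
            continue  # only date_histogram buckets are checked here
        if histogram.get("field") != time_field:
            problems.append(
                f"{name}: date_histogram runs on {histogram.get('field')!r}, "
                f"job expects {time_field!r}"
            )
        # Assumption: a nested max aggregation named after the time field is required.
        nested = agg.get("aggregations", {})
        max_field = nested.get(time_field, {}).get("max", {}).get("field")
        if max_field != time_field:
            problems.append(
                f"{name}: no max aggregation named {time_field!r} on that field"
            )
    return problems


# Trimmed from the datafeed config posted in this thread.
datafeed = {
    "aggregations": {
        "buckets": {
            "date_histogram": {"field": "@timestamp", "fixed_interval": "90000ms"},
            "aggregations": {"@timestamp": {"max": {"field": "@timestamp"}}},
        }
    }
}
broken_job = {"data_description": {"time_field": "timestamp"}}
fixed_job = {"data_description": {"time_field": "@timestamp"}}

broken_problems = time_field_problems(broken_job, datafeed)
fixed_problems = time_field_problems(fixed_job, datafeed)
for problem in broken_problems:
    print(problem)
```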

Thanks again