Machine Learning datafeed for egress traffic

Hi,

I'm trying to create a datafeed for an ML job that detects suspicious egress traffic per host.

PUT _xpack/ml/datafeeds/datafeed-network_out_deriv/
{
  "job_id": "network_out_deriv",
  "indices": [
    "metricbeat-*"
  ],
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "host.name": "qa-control"
          }
        }
      ],
      "must": {
        "exists": {
          "field": "system.network.out.bytes"
        }
      }
    }
  },
  "aggregations": {
    "buckets": {
      "date_histogram": {
        "field": "@timestamp",
        "interval": "10s",
        "time_zone": "UTC"
      },
      "aggregations": {
        "@timestamp": {
          "max": {
            "field": "@timestamp"
          }
        },
        "host.name": {
          "terms": {
            "field": "host.name"
          }
        },
        "network_out": {
          "max": {
            "field": "system.network.out.bytes"
          }
        },
        "network_out_deriv": {
          "derivative": {
            "buckets_path": "network_out"
          }
        }
      }
    }
  }
}

The datafeed above works, but as you can see, the hostname is hardcoded. The problem is that I don't know how to add another aggregation for hostnames (and partition the ML job by it) while still using the derivative aggregation. As soon as I add another aggregation, the derivative stops working, see below:

GET /_search
{
  "aggregations": {
    "buckets": {
      "date_histogram": {
        "field": "@timestamp",
        "interval": "1000m",
        "time_zone": "UTC"
      },
      "aggregations": {
        "@timestamp": {
          "max": {
            "field": "@timestamp"
          }
        },
        "aggs": {
          "terms": {
            "field": "host.name"
          },
          "aggs": {
            "network_out": {
              "max": {
                "field": "system.network.out.bytes"
              }
            },
            "network_out_deriv": {
              "derivative": {
                "buckets_path": "network_out"
              }
            }
          }
        }
      }
    }
  }
}

Receiving this error:

    "type" : "illegal_state_exception",
    "reason" : "derivative aggregation [network_out_deriv] must have a histogram, date_histogram or auto_date_histogram as parent"

If anybody could give me some tips on how to proceed, I would really appreciate it.
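
One possible restructuring, offered as a sketch rather than a confirmed fix: the error says the derivative needs a histogram-type aggregation as its direct parent, and in the failing query its parent is the terms aggregation. Swapping the nesting order, so that the terms aggregation on host.name is outermost and the date_histogram sits inside it, gives the derivative a date_histogram parent again and computes it separately for each host. The field names and the 10s interval are copied from the datafeed above. It is an assumption here that the datafeed in your Elasticsearch version accepts a date_histogram nested under a terms aggregation, and that the terms aggregation must be named after the field the job partitions on; both should be checked against the ML aggregation documentation for your version.

```json
{
  "aggregations": {
    "host.name": {
      "terms": {
        "field": "host.name"
      },
      "aggregations": {
        "buckets": {
          "date_histogram": {
            "field": "@timestamp",
            "interval": "10s",
            "time_zone": "UTC"
          },
          "aggregations": {
            "@timestamp": {
              "max": {
                "field": "@timestamp"
              }
            },
            "network_out": {
              "max": {
                "field": "system.network.out.bytes"
              }
            },
            "network_out_deriv": {
              "derivative": {
                "buckets_path": "network_out"
              }
            }
          }
        }
      }
    }
  }
}
```

With this shape the hardcoded term filter on host.name would no longer be needed in the datafeed query. Note that a terms aggregation returns only the top 10 buckets by default, so with more hosts than that a larger "size" would have to be set on it.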
