How to create a machine learning job to compare two counts?

Hello. I have an index with application logs from all our servers. Traffic to the servers is load balanced, and I need to create a machine learning job that compares request counts between the servers. The indexed data looks like this:

SERVER1 | some request
SERVER2 | some request
SERVER2 | some request
SERVER2 | some request
SERVER1 | some request

I need to create a job that compares the counts for SERVER1 and SERVER2. In this example there are 2 requests for SERVER1 and 3 requests for SERVER2. How can I create such a machine learning job?


You could do this easily if the counts for the two servers were summarized in the same document: you would then just use a scripted field to compute the difference of the two fields on the fly.

That's probably not the case here, however, since you imply that each line is a separate document. Instead, this is likely handled with a nested aggregation in the Elasticsearch query. By default, ML jobs just run a "match_all" query, but the query can be customized if you create an advanced job. I can imagine using a date_histogram aggregation on the timestamp field in your data (with an interval equal to the bucket_span of the job), then combining a terms filter for each server with a bucket script aggregation: the filters produce the two counts, and the bucket script subtracts one from the other. This new value is what you'd have the ML job operate on.
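To make the per-bucket arithmetic concrete, here is a minimal Python sketch of what that aggregation would compute. It runs locally on a list of events and does not talk to Elasticsearch. The timestamps and the function name are hypothetical (the sample data in the question has no timestamps), and the 300-second bucket span is just an assumed value.

```python
from collections import Counter, defaultdict


def count_difference_per_bucket(events, bucket_span_s, first, second):
    """Return {bucket_start: count(first) - count(second)} for each time bucket.

    events is an iterable of (epoch_seconds, server_name) pairs.
    """
    buckets = defaultdict(Counter)
    for ts, server in events:
        # Align the timestamp to the start of its bucket, like a date_histogram.
        bucket_start = ts - ts % bucket_span_s
        buckets[bucket_start][server] += 1
    # A Counter returns 0 for a server with no events in a bucket.
    return {start: c[first] - c[second] for start, c in sorted(buckets.items())}


# The five sample lines from the question, with made-up timestamps.
events = [
    (0, "SERVER1"),
    (10, "SERVER2"),
    (20, "SERVER2"),
    (30, "SERVER2"),
    (40, "SERVER1"),
]

diffs = count_difference_per_bucket(events, 300, "SERVER1", "SERVER2")
print(diffs)  # {0: -1}, i.e. 2 requests minus 3 requests
```

With perfectly balanced load the difference stays near zero in every bucket, so a sustained drift away from zero is the signal the job would look for.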

The query would look something like this (using a similar example on web access logs):


Notice that the new field "diff.value" in the response contains the number of GETs minus the number of POSTs in this index of web logs, computed over 5-minute bucket intervals.
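A request body along these lines could be sketched as follows, built as a Python dictionary and printed as JSON. This is an illustration under assumptions, not the exact query from the example: the field names "@timestamp" and "verb" and the aggregation names are hypothetical, and the date_histogram uses the "interval" parameter as in 5.x (later versions use "fixed_interval"). It is a plain search body and says nothing about how an ML datafeed would consume it.

```python
import json


def build_diff_aggregation(time_field, split_field, first, second, interval="5m"):
    """Build a search body that computes count(first) - count(second) per time bucket."""
    return {
        "size": 0,  # only the aggregation results are needed, not the hits
        "aggs": {
            "buckets": {
                "date_histogram": {"field": time_field, "interval": interval},
                "aggs": {
                    # One filter aggregation per value; each yields a doc count.
                    "first": {"filter": {"term": {split_field: first}}},
                    "second": {"filter": {"term": {split_field: second}}},
                    # Pipeline aggregation that subtracts the two counts.
                    "diff": {
                        "bucket_script": {
                            "buckets_path": {
                                "first": "first>_count",
                                "second": "second>_count",
                            },
                            "script": "params.first - params.second",
                        }
                    },
                },
            }
        },
    }


# Hypothetical field names for a web access log index.
body = build_diff_aggregation("@timestamp", "verb", "GET", "POST")
print(json.dumps(body, indent=2))
```

Because the bucket script is named "diff", each histogram bucket in the response carries the result under "diff.value". For the two-server case, the same function could be called with the server field and the values SERVER1 and SERVER2.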

However, I have realized that this type of complex aggregation is ignored by ML jobs in v5.5. Apparently, v5.6 (due around the end of August or early September) includes a provision that will allow it.

Let's revisit this issue at that time and see if we can get this working for you!


Great, thanks very much. I look forward to testing this in 5.6.
