Help with query. Remove top level buckets using child terms counts


(Brad Laney) #1

I cannot seem to write this query. It's very advanced for what I am used to doing. Here is a pseudo representation of what I am trying to accomplish:

terms SerialNumber
  terms SuccessCode
only keeps terms if SerialNumber.SuccessCode.OK.Count == 0 && SerialNumber.SuccessCode.SpecificError.Count > 0

Here is an example data

{ SerialNumber: 1, SuccessCode: "OK" }
{ SerialNumber: 1, SuccessCode: "SpecificError" }
{ SerialNumber: 2, SuccessCode: "OK" }
{ SerialNumber: 3, SuccessCode: "OtherErrorIDontCareAbout" }
{ SerialNumber: 4, SuccessCode: "SpecificError" }

The output needs to simply give the client this record only
{ SerialNumber: 4 }

This is because
SerialNumber 1 has a SuccessCode of "OK" appearing at least once
SerialNumber 2 has a SuccessCode of "OK" appearing at least once
SerialNumber 3 has no SuccessCode of "SpecificError" appearing at least once


So far what I tried to do was a top level terms over SerialNumber, then a sub aggs with filter over SuccessCode OK, and a filter over SuccessCode SpecificError. Then I added bucket_script to pull the OK.Count (using filter_name._count) and then the same way for SpecificError.Count. Then above that I tried adding a bucket_selector at the top level to compare the two script variables I created with ok_count == 0 && specific_error_count > 0. But this fails because you cannot have a top level pipeline aggregate (bucket_selector) beside a bucket aggregate (terms).

Anyone have a clue? Below is what I have so far.

{
  "size": 0,
  "aggs": {
    "serial_number_terms": {
      "terms": {
        "field": "SerialNumber"
      },
      "aggs": {
        "ok_filter": {
        "filter": { "match": { "SuccessCode": "OK" } }
        },
        "ok_count": {
          "bucket_script": {
            "buckets_path": {
              "var1": "ok_filter>_count"
            },
            "script": "params.var1"
          }
        },
        "fail_filter": {
          "filter": { "match": { "SuccessCode": "SpecificError" } }
        },
        "fail_count": {
          "bucket_script": {
            "buckets_path": {
              "var1": "fail_filter>_count"
            },
            "script": "params.var1"
          }
        }
      }
    },
    "top_level_filter": {
      "bucket_selector": {
        "buckets_path": {
          "var1": "serial_number_terms>ok_count",
          "var2": "serial_number_terms>fail_count"
        },
        "script": "var1 == 0 && var2 > 0"
      }
    }
  }
}

The sql would be:

select coalesce(t1.SerialNumber, t2.SerialNumber) SerialNumber
from
(select SerialNumber, count(*) cnt from logs
group by SerialNumber
where SuccessCode = 'OK') t1
full outer join
(select SerialNumber, count(*) cnt from logs
group by SerialNumber
where SuccessCode = 'SpecificError') t2
on t1.SerialNumber = t2.SerialNumber
where coalesce(t1.cnt, 0) = 0 and t2.cnt > 0