I cannot seem to write this query. It's very advanced for what I am used to doing. Here is a pseudo representation of what I am trying to accomplish:
terms SerialNumber
terms SuccessCode
only keeps terms if SerialNumber.SuccessCode.OK.Count == 0 && SerialNumber.SuccessCode.SpecificError.Count > 0
Here is an example data
{ SerialNumber: 1, SuccessCode: "OK" }
{ SerialNumber: 1, SuccessCode: "SpecificError" }
{ SerialNumber: 2, SuccessCode: "OK" }
{ SerialNumber: 3, SuccessCode: "OtherErrorIDontCareAbout" }
{ SerialNumber: 4, SuccessCode: "SpecificError" }
The output needs to simply give the client this record only
{ SerialNumber: 4 }
This is because
SerialNumber 1 has a SuccessCode of "OK" appearing at least once
SerialNumber 2 has a SuccessCode of "OK" appearing at least once
SerialNumber 3 has no SuccessCode of "SpecificError" appearing at least once
So far what I tried to do was a top level terms over SerialNumber, then a sub aggs with filter over SuccessCode OK, and a filter over SuccessCode SpecificError. Then I added bucket_script to pull the OK.Count (using filter_name._count) and then the same way for SpecificError.Count. Then above that I tried adding a bucket_selector at the top level to compare the two script variables I created with ok_count == 0 && specific_error_count > 0. But this fails because you cannot have a top level pipeline aggregate (bucket_selector) beside a bucket aggregate (terms).
Anyone have a clue? Below is what I have so far.
{
"size": 0,
"aggs": {
"serial_number_terms": {
"terms": {
"field": "SerialNumber"
},
"aggs": {
"ok_filter": {
"filter": { "match": { "SuccessCode": "OK" } }
},
"ok_count": {
"bucket_script": {
"buckets_path": {
"var1": "ok_filter>_count"
},
"script": "params.var1"
}
},
"fail_filter": {
"filter": { "match": { "SuccessCode": "SpecificError" } }
},
"fail_count": {
"bucket_script": {
"buckets_path": {
"var1": "fail_filter>_count"
},
"script": "params.var1"
}
}
}
},
"top_level_filter": {
"bucket_selector": {
"buckets_path": {
"var1": "serial_number_terms>ok_count",
"var2": "serial_number_terms>fail_count"
},
"script": "var1 == 0 && var2 > 0"
}
}
}
}
The sql would be:
select coalesce(t1.SerialNumber, t2.SerialNumber) SerialNumber
from
(select SerialNumber, count(*) cnt from logs
group by SerialNumber
where SuccessCode = 'OK') t1
full outer join
(select SerialNumber, count(*) cnt from logs
group by SerialNumber
where SuccessCode = 'SpecificError') t2
on t1.SerialNumber = t2.SerialNumber
where coalesce(t1.cnt, 0) = 0 and t2.cnt > 0