When I ran these two queries separately, the results looked good:
# Failed logins
GET <shard>/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "bool": {
            "should": [
              {
                "match_phrase": {
                  "request": "/login"
                }
              },
              {
                "match": {
                  "response": 401
                }
              }
            ],
            "minimum_should_match": "100%"
          }
        }
      ]
    }
  }
}
# All other requests to the app
GET <shard>/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "bool": {
            "must": {
              "match": {
                "app": "appName"
              }
            },
            "must_not": {
              "bool": {
                "should": [
                  {
                    "match_phrase": {
                      "request": "/login"
                    }
                  }
                ],
                "minimum_should_match": "100%"
              }
            }
          }
        }
      ]
    }
  }
}
I'm wondering whether the script portion is the problem when I run them inside the aggregation query.
Here is the whole thing, based on your example:
GET <shard>/_search
{
  "size": 0,
  "aggs": {
    "ips": {
      "terms": {
        "field": "clientip.keyword",
        "size": 10000
      },
      "aggs": {
        "not_login_url": {
          "filter": {
            "bool": {
              "filter": [
                {
                  "bool": {
                    "must": {
                      "match": {
                        "app": "appName"
                      }
                    },
                    "must_not": {
                      "bool": {
                        "should": [
                          {
                            "match_phrase": {
                              "request": "/login"
                            }
                          }
                        ],
                        "minimum_should_match": "100%"
                      }
                    }
                  }
                }
              ]
            }
          }
        },
        "login_url": {
          "filter": {
            "bool": {
              "filter": [
                {
                  "bool": {
                    "should": [
                      {
                        "match_phrase": {
                          "request": "/login"
                        }
                      },
                      {
                        "match": {
                          "response": 401
                        }
                      }
                    ],
                    "minimum_should_match": "100%"
                  }
                }
              ]
            }
          }
        },
        "bots": {
          "bucket_selector": {
            "buckets_path": {
              "login_url": "login_url._count",
              "not_login_url": "not_login_url._count"
            },
            "script": "params.login_url > 10 && params.not_login_url == 0"
          }
        }
      }
    }
  }
}
When I removed the params.not_login_url == 0 portion of the script, I got no buckets back:
"hits" : {
  "total" : {
    "value" : 10000,
    "relation" : "gte"
  },
  "max_score" : null,
  "hits" : [ ]
},
"aggregations" : {
  "ips" : {
    "doc_count_error_upper_bound" : 0,
    "sum_other_doc_count" : 4368487,
    "buckets" : [ ]
  }
}
When I removed the params.login_url > 10 portion instead, I got lots of buckets, but they all look like this:
"key" : "<IP>",
"doc_count" : <someNumber>,
"login_url" : {
  "doc_count" : 0
},
"not_login_url" : {
  "doc_count" : 0
}
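So for some reason, inside the aggregation params.login_url never gets above 10 (it is 0 in every bucket I see) and params.not_login_url is 0 as well, even though the standalone queries return results. The separate filter aggregations don't seem to be working.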
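To sanity-check my reading of the two experiments, here is a minimal Python sketch of how I understand bucket_selector to behave: it keeps a bucket only when the script returns true. The `bucket_selector` function is just a hypothetical stand-in, not an Elasticsearch API, and the IPs and counts are made up to mirror the buckets shown above, where both sub-aggregation counts are 0.

```python
# Made-up buckets mirroring the output above: both filter counts are 0.
buckets = [
    {"key": "10.0.0.1", "doc_count": 120, "login_url": 0, "not_login_url": 0},
    {"key": "10.0.0.2", "doc_count": 45, "login_url": 0, "not_login_url": 0},
]

def bucket_selector(buckets, script):
    """Hypothetical stand-in: keep only buckets where the script is true."""
    return [b for b in buckets if script(b)]

# Full script, then each half on its own.
both = bucket_selector(
    buckets, lambda p: p["login_url"] > 10 and p["not_login_url"] == 0
)
login_only = bucket_selector(buckets, lambda p: p["login_url"] > 10)
not_login_only = bucket_selector(buckets, lambda p: p["not_login_url"] == 0)

print(len(both), len(login_only), len(not_login_only))  # prints: 0 0 2
```

If that model is right, the results I saw (no buckets with only the login_url condition, every bucket with only the not_login_url condition) are what you would get when both counts are 0, which points at the filter sub-aggregations rather than the script itself.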
Any ideas?