I suppose use transform
and create another index is a best way. I will explain.
To know IP 192.168.1.1 has both Port 22 and 3389, you need to aggregate by IP. Even in Elasticsearch, to query or filter on aggregated buckets are really restricted. The only way is bucket selector aggregation. It is not as free as query / filter on documents.
If you want to know the count on Elasticsearch, this query:
GET /test_bucket_selecter/_search?filter_path=aggregations.stats.count,aggregations.hosts.buckets.key
{
"query":{
"bool":{
"should":[
{"term":{"Port":"22"}},
{"term":{"Port":"3389"}}
]
}
},
"size":0,
"aggs":{
"hosts":{
"multi_terms":{
"terms":[
{"field":"IP"},
{"field":"OS"}
]
},
"aggs":{
"port_cardinality":{
"cardinality":{
"field":"Port"
}
},
"cardinality_filter":{
"bucket_selector": {
"buckets_path": {
"portCardinality": "port_cardinality"},
"script": "params.portCardinality >= 2"
}
}
}
},
"stats":{
"stats_bucket": {
"buckets_path": "hosts>port_cardinality"
}
}
}
}
gives you:
{
"aggregations" : {
"hosts" : {
"buckets" : [
{
"key" : [
"192.168.1.1",
"Windows 10"
]
}
]
},
"stats" : {
"count" : 1
}
}
}
but such complex query is not achivable with "aggregation based > Data Table".
As my recent post, you can create transform
to get an index containing documents like:
{
"IP" : "192.168.1.1"
"OS" : [
"Windows 10"
],
"Port" : [
22,
3389
]
}
With this structure documents, your query is easily achieved and more free to visualize in dashboard.
Sample transform
is:
PUT _transform/test_ip_transform
{
"pivot":{
"group_by":{
"IP":{
"terms":{
"field":"IP"
}
}
},
"aggs":{
"Port":{
"scripted_metric": {
"init_script": "state.set = new HashSet()",
"map_script": "if (params['_source'].containsKey(params.field)) {state.set.add(params['_source'][params.field])}",
"combine_script": "return state.set",
"reduce_script": "def ret = new HashSet(); for (s in states) {for (k in s) {ret.add(k);}} return ret",
"params":{
"field": "Port"
}
}
},
"OS":{
"scripted_metric": {
"init_script": "state.set = new HashSet()",
"map_script": "if (params['_source'].containsKey(params.field)) {state.set.add(params['_source'][params.field])}",
"combine_script": "return state.set",
"reduce_script": "def ret = new HashSet(); for (s in states) {for (k in s) {ret.add(k);}} return ret",
"params":{
"field": "OS"
}
}
}
}
},
"source": {
"index":"test_ip"
},
"dest":{
"index": "test_ip_transformed"
},
"frequency": "1m"
}
POST _transform/test_ip_transform/_start
Then you get:
GET /test_ip_transformed/_search?filter_path=hits.hits._source
{
"hits" : {
"hits" : [
{
"_source" : {
"OS" : [
"Windows 10"
],
"IP" : "192.168.1.1",
"Port" : [
22,
3389
]
}
},
{
"_source" : {
"OS" : [
"Windows 2008 R2"
],
"IP" : "192.168.1.2",
"Port" : [
22,
110
]
}
}
]
}
}
With this structure, you can easily query and visualize what you want.
Take care about transform limitations.