Way to override the doc_count based aggregation


(Phani Kumar Varma) #1

{
"date": "2015-06-25T17:20:20",
"ip": "192.138.63.233",
"ec": "106023",
"sip": "160.141.161.137",
"sport": "57735",
"dip": "128.44.113.244",
"dport": "389",
"proto": "udp",
“bytesin”: 125,
“bytesout”: 250
}
{
"date": "2015-06-25T17:20:20",
"ip": "192.138.63.233",
"ec": "106023",
"sip": "160.141.162.162",
"sport": "53576",
"siface": "inside",
"dip": "160.141.147.78",
"dport": "389",
"proto": "udp",
“bytesin”: 225,
“bytesout”: 550
}

I am a bit new to Elastic Search, We would like to run few aggregations on the above indexed documents.

Our intention is to get the top 10 "dip" fields which are having maximum sum of "bytesin" data.
My query in sql will be like "select dip, sum(bytesin) data from table group by dip order by sum(bytesin) desc".

currently I am running below GET request in Elastic Search but it is not providing what exactly the above sql query gives.

"aggs": {
"top-sip": {
"terms": {
"field": "dip",
"size": 10,
"order": {
"totalbytes": "desc"
}
},
"aggs": {
"totalbytes": {
"sum": {
"field" : "bytesin"
}
}
}
}
}

The above aggregation retrieves the top 10 "dips" with respect to doc_count and then it aggregates the "bytesin" field for those top "dips".

But what I expect(sql query) is, group the "dips" and aggregates the "bytesin" field for the grouped dip's. Finally it retreive the top 10 dips with respect to sum(bytesin). This is a way to override the default aggregation on ‘doc_count’ of a field. For my use-case, identifying the top dips which are involved in consuming highest bandwidth (here bytesin field).

Can any one suggest me how to execute the aggregation to get the expected result.


(system) #2