Hi all,
I'm trying to come up with a query that results in a table that is similar to:
SourceCity | dstCity | dstCountry | dstCountryISO | total sum of flows | total data transferred
cityA | cityB | countryB | CB | 15 | 3GB
cityA | cityC | countryB | CB | 25 | 2GB
cityB | cityD | countryD | CD | 125 | 12GB
The records in ES are similar to the below (pruned other data that isn't required)
{
"_index" : "flows-2021.07.03-000001",
"_type" : "_doc",
"_id" : "flows-2021.07.03+0+82",
"_score" : 1.0,
"_source" : {
"startTime": (unix timestamp)
"srcCountry" : "CountryA",
"srcCountryISO": "CA",
"srcCity" : "cityA",
"srcLatLon": "lat,lon",
"dstCountry" : "CountryB",
"dstCountryISO" : "CB",
"dstCity" : "cityB",
"dstLatLon": "lat,lon",
"srcAddr" : "192.168.100.1",
"dstAddr" : "104.211.26.226"
}
},
How do I aggregate flows based on timestamp (eg, in the last 24hours) and unique destiations, and get the following: dstCountry, dstCountryISO, dstCity, dstLatLon
I tried having nested aggregations of dstCity, dstCountry, dstCountryISO and I found it to be very combursome since the results come in the same order of nesting. I'm sure there is a better and elegant "right" way to query. Something that my current knowledge on ES queries isn't helping me with. Some pointers to this would be useful!
Thanks!