Hi there.
I'd like to create a correctly sorted Vega Sankey visualization.
I'm building one that shows traffic from_ip -> to_ip.
The problem is that the most frequent IPs in the Sankey are not correct: comparing them to the ones appearing in a table, the real top ones are not shown in the Sankey.
After some research I found out that the composite aggregation (used by the Sankey) does not sort its buckets by doc_count, apparently because that would be far too heavy to compute.
In fact, when I run the same aggs from the Vega url in Dev Tools, the results are clearly not sorted by doc_count.
Now, what might be a proper solution to this problem (apart from not using a Vega Sankey)?
Also, I tried simply limiting the results with "size" to get the top ones, but it didn't solve the problem. Any ideas?
Here's my query:
GET my_index*/_search
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "exists": {
            "field": "from_ip"
          }
        }
      ]
    }
  },
  "aggs": {
    "table": {
      "composite": {
        "size": 50,
        "sources": [
          {
            "stk1": {
              "terms": {
                "field": "from_ip.ip",
                "order": "desc"
              }
            }
          },
          {
            "stk2": {
              "terms": {
                "field": "to_ip.ip",
                "order": "desc"
              }
            }
          }
        ]
      }
    }
  }
}
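
For comparison, here is a minimal sketch of the kind of request that does return buckets ordered by doc_count: a plain terms aggregation with a nested terms sub-aggregation, since terms buckets are ordered by descending doc_count by default. The aggregation names (top_sources, top_targets) and the sizes are just placeholders I picked, and I assume the nested buckets would have to be flattened into from/to pairs on the Vega side before the Sankey could use them.

```
# Sketch only: top_sources, top_targets and both sizes are assumed placeholder values
GET my_index*/_search
{
  "size": 0,
  "query": {
    "exists": {
      "field": "from_ip"
    }
  },
  "aggs": {
    "top_sources": {
      "terms": {
        "field": "from_ip.ip",
        "size": 10
      },
      "aggs": {
        "top_targets": {
          "terms": {
            "field": "to_ip.ip",
            "size": 5
          }
        }
      }
    }
  }
}
```

In the composite request above, "order": "desc" only reverses the ordering of the keys (the IP values), not the document counts, which would explain why the first 50 buckets are not the most frequent pairs.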