Vega aggregation - clueless


(Chris Tian) #1

Hi,

I have that default vega sankey visualization diagram which has one term shown on the left and one term shown on the right side. I played with that now half of the day to make another aggregation but failed ultimatively.

The left term is "interface_source", the right term is "interface_destination", so that I can count docs or bytes to see the traffic between all interfaces. The problem is that there is another term which defines the node, because otherwise the "interfaces" are not unique but exist for every node. So I need to group $node with $interface_source for the left side and $node and $interface_destination for the right side.

Has anyone an idea how to do that?

Here the sankey I'm using right now for simple single term visualizations and I would like to enhance:

   {
      $schema: https://vega.github.io/schema/vega/v3.0.json
      "title": {
        "text": {"signal": "'Src/Dst node/interface matrix'"},
        "anchor": "start",
        "frame": "group"
      },
      data: [
        {
          // query ES based on the currently selected time range and filter string
          name: rawData
          url: {
            %context%: true
        %timefield%: @timestamp
        index: netflow*
        body: {
          size: 0
          aggs: {
            table: {
              composite: {
                size: 10000
                sources: [
                  {
                    stk1: {
                      terms: {field: "source-interface"}
                    }
                  }
                  {
                    stk2: {
                      terms: {field: "destination-interface"}
                    }
                  }
                ]
              }
            }
          }
        }
      }
      // From the result, take just the data we are interested in
      format: {property: "aggregations.table.buckets"}
      // Convert key.stk1 -> stk1 for simpler access below
      transform: [
        {type: "formula", expr: "datum.key.stk1", as: "stk1"}
        {type: "formula", expr: "datum.key.stk2", as: "stk2"}
        {type: "formula", expr: "datum.doc_count", as: "size"}
      ]
    }
    {
      name: nodes
      source: rawData
      transform: [
        // when a country is selected, filter out unrelated data
        {
          type: filter
          expr: !groupSelector || groupSelector.stk1 == datum.stk1 || groupSelector.stk2 == datum.stk2
        }
        // Set new key for later lookups - identifies each node
        {type: "formula", expr: "datum.stk1+datum.stk2", as: "key"}
        // instead of each table row, create two new rows,
        // one for the source (stack=stk1) and one for destination node (stack=stk2).
        // The country code stored in stk1 and stk2 fields is placed into grpId field.
        {
          type: fold
          fields: ["stk1", "stk2"]
          as: ["stack", "grpId"]
        }
        // Create a sortkey, different for stk1 and stk2 stacks.
        // Space separator ensures proper sort order in some corner cases.
        {
          type: formula
          expr: datum.stack == 'stk1' ? datum.stk1+' '+datum.stk2 : datum.stk2+' '+datum.stk1
          as: sortField
        }
        // Calculate y0 and y1 positions for stacking nodes one on top of the other,
        // independently for each stack, and ensuring they are in the proper order,
        // alphabetical from the top (reversed on the y axis)
        {
          type: stack
          groupby: ["stack"]
          sort: {field: "sortField", order: "descending"}
          field: size
        }
        // calculate vertical center point for each node, used to draw edges
        {type: "formula", expr: "(datum.y0+datum.y1)/2", as: "yc"}
      ]
    }
    {
      name: groups
      source: nodes
      transform: [
        // combine all nodes into country groups, summing up the doc counts
        {
          type: aggregate
          groupby: ["stack", "grpId"]
          fields: ["size"]
          ops: ["sum"]
          as: ["total"]
        }
        // re-calculate the stacking y0,y1 values
        {
          type: stack
          groupby: ["stack"]
          sort: {field: "grpId", order: "descending"}
          field: total
        }
        // project y0 and y1 values to screen coordinates
        // doing it once here instead of doing it several times in marks
        {type: "formula", expr: "scale('y', datum.y0)", as: "scaledY0"}
        {type: "formula", expr: "scale('y', datum.y1)", as: "scaledY1"}
        // boolean flag if the label should be on the right of the stack
        {type: "formula", expr: "datum.stack == 'stk1'", as: "rightLabel"}
        // Calculate traffic percentage for this country using "y" scale
        // domain upper bound, which represents the total traffic
        {
          type: formula
          expr: datum.total/domain('y')[1]
          as: percentage
        }
      ]
    }
 [...]

(system) closed #2

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.