How to pick the previous value when current value is undefined?

I want to pick the previous value when my current value is undefined. The code below works when only one value is missing, but when several consecutive values are missing in my dataset, I cannot replace them all with the last available value. For example, in the dataset below, y is missing for x=3 and c=0, so the value from x=2 and c=0 is picked and the missing value is replaced with y=51. But y is also missing for c=0 at x=6, 7 and 8, and only the value at x=6 is replaced by 24, i.e., the value of y at x=5. I want y to be assigned 24 for x=7 and 8 as well. Any help would be appreciated. Thanks in advance!

{
 "$schema": "https://vega.github.io/schema/vega/v5.json",
 "width": 500,
 "height": 200,
 "padding": 5,

"signals": [
{
  "name": "undefined",
  "value": null
}
],

"data": [
{
  "name": "table",
  "values": [
    {"x": 0, "y": 28, "c":0}, {"x": 0, "y": 20, "c":1},
    {"x": 1, "y": 43, "c":0}, {"x": 1, "y": 35, "c":1},
    {"x": 2, "y": 51, "c":0}, {"x": 2, "y": 10, "c":1},
    {"x": 3,  "c":0}, {"x": 3, "y": 15, "c":1},
    {"x": 4, "y": 52, "c":0}, {"x": 4, "y": 48, "c":1},
    {"x": 5, "y": 24, "c":0}, {"x": 5, "y": 28, "c":1},
    {"x": 6, "c":0}, {"x": 6, "y": 66, "c":1},
    {"x": 7,  "c":0}, {"x": 7, "y": 27, "c":1},
    {"x": 8,  "c":0}, {"x": 8, "y": 16, "c":1},
    {"x": 9, "y": 49, "c":0}, {"x": 9, "y": 25, "c":1}
  ],
  "transform": [
    {
      "type": "window",
      "frame": [1, 0],
      "groupby": ["c"],
      "fields": ["y"],
      "ops": ["first_value"]
    },
    {
      "type": "formula",
      "expr": "if(datum['y']===datum[undefined],datum['first_value_y'],datum['y'])",
      "as": "y"
    }
  ]
}
],

"scales": [
{
  "name": "x",
  "type": "point",
  "range": "width",
  "domain": {"data": "table", "field": "x"}
},
{
  "name": "y",
  "type": "linear",
  "range": "height",
  "nice": true,
  "zero": true,
  "domain": {"data": "table", "field": "y"}
},
{
  "name": "color",
  "type": "ordinal",
  "range": "category",
  "domain": {"data": "table", "field": "c"}
}
],

"axes": [
{"orient": "bottom", "scale": "x"},
{"orient": "left", "scale": "y"}
],

"marks": [
{
  "type": "group",
  "from": {
    "facet": {
      "name": "series",
      "data": "table",
      "groupby": "c"
    }
  },
  "marks": [
    {
      "type": "line",
      "from": {"data": "series"},
      "encode": {
        "enter": {
          "x": {"scale": "x", "field": "x"},
          "y": {"scale": "y", "field": "y"},
          "stroke": {"scale": "color", "field": "c"},
          "strokeWidth": {"value": 2}
        },
        "update": {
          "fillOpacity": {"value": 1}
        },
        "hover": {
          "fillOpacity": {"value": 0.5}
        }
      }
    },
   {
      "type": "symbol",
      "from": {"data": "series"},
      "encode": {
        "enter": {
          "x": {"scale": "x", "field": "x"},
          "y": {"scale": "y", "field": "y"},
          "fill": {"scale": "color", "field": "c"}
        },
        "update": {
          "fillOpacity": {"value": 1}
        },
        "hover": {
          "fillOpacity": {"value": 0.5}
        }
      }
    }
  ]
}
]
 }

Hey @fernsrea,
I think you should use Vega's impute transform for that: https://vega.github.io/vega-lite/docs/impute.html

Hi, thank you for your reply. Could you please show me an example of using impute to pick the previous value when there are multiple consecutive missing values?

Hey @nyuriks, can you help @fernsrea with this?
I dug a bit more into the impute transform, but it seems to allow only a few methods (value, mean, median, max, or min) and nothing like carrying the last non-null value forward. Do you have any idea?
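
To make the limitation concrete, here is a minimal sketch of what impute can do with this data. The derived data set name "imputed" is made up for illustration, and it assumes "table" holds only the raw values with no transforms of its own. It also assumes, as far as I understand the transform, that impute generates missing data objects rather than filling undefined fields on existing ones, so the rows without y are filtered out first. Each gap then gets the per-series mean, not the previous value:

```json
{
  "name": "imputed",
  "source": "table",
  "transform": [
    {"type": "filter", "expr": "datum.y != null"},
    {
      "type": "impute",
      "field": "y",
      "key": "x",
      "groupby": ["c"],
      "method": "mean"
    },
    {"type": "collect", "sort": {"field": "x"}}
  ]
}
```

The final collect step re-sorts by x, since the generated objects are not guaranteed to sit between their neighbours.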


It's an interesting idea for impute, I added a feature request.

In the meantime, I can't really think of a good way to solve it (you could also try asking at the Vega forums; your question is not specific to Elasticsearch, so just link to this discussion or copy/paste your code).

One option would be to simply remove the datums with missing values and rely on the drawing to hide the gaps visually. You can even use "step" interpolation (or step-after/step-before) to make it perfectly accurate. Also note the tooltip on the second mark, which simplifies inspection. When used in Kibana, I would also recommend moving everything from the "enter" section into "update" to avoid drawing artifacts.


{
  "$schema": "https://vega.github.io/schema/vega/v5.json",
  "width": 500,
  "height": 200,
  "padding": 5,
  "data": [
    {
      "name": "table",
      "values": [
        {"x": 0, "y": 28, "c": 0},
        {"x": 0, "y": 20, "c": 1},
        {"x": 1, "y": 43, "c": 0},
        {"x": 1, "y": 35, "c": 1},
        {"x": 2, "y": 51, "c": 0},
        {"x": 2, "y": 10, "c": 1},
        {"x": 3, "c": 0},
        {"x": 3, "y": 15, "c": 1},
        {"x": 4, "y": 52, "c": 0},
        {"x": 4, "y": 48, "c": 1},
        {"x": 5, "y": 24, "c": 0},
        {"x": 5, "y": 28, "c": 1},
        {"x": 6, "c": 0},
        {"x": 6, "y": 66, "c": 1},
        {"x": 7, "c": 0},
        {"x": 7, "y": 27, "c": 1},
        {"x": 8, "c": 0},
        {"x": 8, "y": 16, "c": 1},
        {"x": 9, "y": 49, "c": 0},
        {"x": 9, "y": 25, "c": 1}
      ],
      "transform": [
        {"type": "filter", "expr": "datum.y || datum.y === 0"}
      ]
    }
  ],
  "scales": [
    {
      "name": "x",
      "type": "point",
      "range": "width",
      "domain": {"data": "table", "field": "x"}
    },
    {
      "name": "y",
      "type": "linear",
      "range": "height",
      "nice": true,
      "zero": true,
      "domain": {"data": "table", "field": "y"}
    },
    {
      "name": "color",
      "type": "ordinal",
      "range": "category",
      "domain": {"data": "table", "field": "c"}
    }
  ],
  "axes": [
    {"orient": "bottom", "scale": "x"},
    {"orient": "left", "scale": "y"}
  ],
  "marks": [
    {
      "type": "group",
      "from": {"facet": {"name": "series", "data": "table", "groupby": "c"}},
      "marks": [
        {
          "type": "line",
          "from": {"data": "series"},
          "encode": {
            "enter": {
              "x": {"scale": "x", "field": "x"},
              "y": {"scale": "y", "field": "y"},
              "stroke": {"scale": "color", "field": "c"},
              "strokeWidth": {"value": 2},
              "interpolate": {"value": "step-after"}
            },
            "update": {"fillOpacity": {"value": 1}},
            "hover": {"fillOpacity": {"value": 0.5}}
          }
        },
        {
          "type": "symbol",
          "from": {"data": "series"},
          "encode": {
            "enter": {
              "x": {"scale": "x", "field": "x"},
              "y": {"scale": "y", "field": "y"},
              "fill": {"scale": "color", "field": "c"},
              "tooltip": {"signal": "datum"}
            },
            "update": {"fillOpacity": {"value": 1}},
            "hover": {"fillOpacity": {"value": 0.5}}
          }
        }
      ]
    }
  ]
}
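
For the Kibana recommendation, a rough sketch of the line mark with its "enter" properties moved into "update" might look like the following. It is only the one mark, not a full spec, and the symbol mark would be changed the same way:

```json
{
  "type": "line",
  "from": {"data": "series"},
  "encode": {
    "update": {
      "x": {"scale": "x", "field": "x"},
      "y": {"scale": "y", "field": "y"},
      "stroke": {"scale": "color", "field": "c"},
      "strokeWidth": {"value": 2},
      "interpolate": {"value": "step-after"},
      "fillOpacity": {"value": 1}
    },
    "hover": {"fillOpacity": {"value": 0.5}}
  }
}
```

Properties in "update" are re-evaluated whenever the data or scales change, whereas "enter" properties are set only when a mark item is first created.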

P.S. The Vega team is clearly super efficient. This new feature has already been implemented and merged, and will be released in the next version of Vega. We now just need to upgrade Kibana to use it :)
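
The post does not name the new feature. Assuming it is the prev_value window operation listed in the Vega window transform documentation for later 5.x releases, a sketch of the original request could look like this. The data set name "filled" and the field name "prev_y" are made up for illustration, and "table" is assumed to hold the raw values without any transform of its own:

```json
{
  "name": "filled",
  "source": "table",
  "transform": [
    {
      "type": "window",
      "sort": {"field": "x"},
      "groupby": ["c"],
      "fields": ["y"],
      "ops": ["prev_value"],
      "as": ["prev_y"]
    },
    {
      "type": "formula",
      "expr": "datum.y != null ? datum.y : datum.prev_y",
      "as": "y"
    }
  ]
}
```

With this in place, the scales and the facet would read from "filled" instead of "table". The formula keeps y wherever it exists and only falls back to the carried value, so the sketch does not depend on whether prev_value considers the current row.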
