Vega scatter plot based on array of points in document

qd-danh · May 20, 2022, 4:53am

I have been working through simple scatter plots and boxplots in Kibana with Vega. Now I'm getting to what I actually need to render.

Each document from Elasticsearch has an array of points. A query will likely return up to 10 documents. Ideally, what I'm after is an x,y scatter plot for each document, with those plots layered on top of each other.

Some of the features / docs I have been looking into:

Layering
Repeating - although this looks like it's one chart per PROPERTY on a document
Flatten transform - but this seems to work on single element arrays and makes objects out of them. I already have objects with X, Y in each.

I'm struggling to come up with an approach, or to know if Kibana + Vega can do this.

One layer per document.
Each layer is a scatter plot (data series based on x,y object array in the document)

This is a crude rendering from Excel that I hope gets the point across (one series for each of the 3 documents in the sample data below)

Here is an example of my data

{
    "values" : [
        {
            "_id" : "3gvR138B1LivDftAs_NA",
            "_score" : 1.0,
            "_source" : {
                "date" : "2022-01-31T18:26:27",
                "points" : [
                    { 
                        "x" : 0,
                        "y" : 100
                    },
                    { 
                        "x" : 1,
                        "y" : 120
                    },
                    { 
                        "x" : 2,
                        "y" : 105
                    },
                    { 
                        "x" : 3,
                        "y" : 108
                    },
                    { 
                        "x" : 4,
                        "y" : 117
                    }
                ]
            }
        },
        {
            "_id" : "3wvR138B1LivDftAs_NA",
            "_score" : 1.0,
            "_source" : {
                "date" : "2022-01-31T18:26:27",
                "points" : [
                    { 
                        "x" : 0,
                        "y" : 98
                    },
                    { 
                        "x" : 1,
                        "y" : 105
                    },
                    { 
                        "x" : 2,
                        "y" : 110
                    },
                    { 
                        "x" : 3,
                        "y" : 115
                    },
                    { 
                        "x" : 4,
                        "y" : 113
                    }
                ]
            }
        },
        {
            "_id" : "4AvR138B1LivDftAs_NA",
            "_score" : 1.0,
            "_source" : {
                "date" : "2022-01-31T18:26:27",
                "points" : [
                    { 
                        "x" : 0,
                        "y" : 115
                    },
                    { 
                        "x" : 1,
                        "y" : 120
                    },
                    { 
                        "x" : 2,
                        "y" : 113
                    },
                    { 
                        "x" : 3,
                        "y" : 122
                    },
                    { 
                        "x" : 4,
                        "y" : 130
                    }
                ]
            }
        }
    ]
}

I am able to transform the JSON into some other format if that would make it easier to plot in Vega(-Lite) as desired.

Clear as mud? Thank you in advance for any brainstorming thoughts.

Tomo_M · May 21, 2022, 10:15am

I think you need the flatten transform. This is possible with Kibana + Vega, and you can use Vega-Lite if you prefer.

With the flatten transform, you can reshape your data into:

"_id": "3w..", "x": 0, "y": 98
"_id": "3w..", "x": 1, "y": 105
"_id": "3w..", "x": 2, "y": 110
...
"_id": "3g..", "x": 0, "y": 100
"_id": "3g..", "x": 1, "y": 120
...

After transforming your data, the Colored Scatterplot example may help you write your spec.
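
A rough sketch of that idea in Vega-Lite, assuming trimmed inline data (two documents, two points each) in place of the Elasticsearch response. The alias `pt` is an arbitrary name chosen for this example:

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Sketch only: trimmed sample data; 'pt' is an arbitrary alias for the flattened point.",
  "data": {
    "values": [
      {
        "_id": "3gvR138B1LivDftAs_NA",
        "_source": {"points": [{"x": 0, "y": 100}, {"x": 1, "y": 120}]}
      },
      {
        "_id": "3wvR138B1LivDftAs_NA",
        "_source": {"points": [{"x": 0, "y": 98}, {"x": 1, "y": 105}]}
      }
    ]
  },
  "transform": [
    {"flatten": ["_source.points"], "as": ["pt"]}
  ],
  "mark": "point",
  "encoding": {
    "x": {"field": "pt.x", "type": "quantitative"},
    "y": {"field": "pt.y", "type": "quantitative"},
    "color": {"field": "_id", "type": "nominal"}
  }
}
```

The flatten step emits one row per array element (four rows here), each still carrying its parent `_id`, which is what lets the color channel separate the documents.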

(The plot you showed is not a scatter plot. Isn't it a line plot? Which do you need?)

qd-danh · May 23, 2022, 5:19am

Thank you for the help @Tomo_M.

I dug a little deeper into the flatten transform. The docs only show a simple array of integers, so I wasn't sure what it would do with objects that have x and y properties. It seems to work great! As you illustrate, I get a repeated copy of the original document for each of the array elements. That lets me plot the x,y data, and I then used nominal encoding to color by document / id.

You are right, I was fluctuating between line chart and scatter plots in my samples. Sorry for the confusion.

Here is my spec in case it helps others (it works in the online Vega Editor).

{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",

  "width": 600,
  "height" : 200,

  "data": {
    "name": "theData",
    
    "values" : [
        {
            "_id" : "3gvR138B1LivDftAs_NA",
            "_score" : 1.0,
            "_source" : {
                "date" : "2022-01-29T18:26:27",
                "id" : 111,
                "points" : [
                    { 
                        "x" : 0,
                        "y" : 100
                    },
                    { 
                        "x" : 1,
                        "y" : 60
                    },
                    { 
                        "x" : 2,
                        "y" : 105
                    },
                    { 
                        "x" : 3,
                        "y" : 138
                    },
                    { 
                        "x" : 4,
                        "y" : 117
                    }
                ]
            }
        },
        {
            "_id" : "3wvR138B1LivDftAs_NA",
            "_score" : 1.0,
            "_source" : {
                "date" : "2022-01-30T18:26:27",
                "id" : 222,
                "points" : [
                    { 
                        "x" : 0,
                        "y" : 98
                    },
                    { 
                        "x" : 1,
                        "y" : 135
                    },
                    { 
                        "x" : 2,
                        "y" : 110
                    },
                    { 
                        "x" : 3,
                        "y" : 90
                    },
                    { 
                        "x" : 4,
                        "y" : 113
                    }
                ]
            }
        },
        {
            "_id" : "4AvR138B1LivDftAs_NA",
            "_score" : 1.0,
            "_source" : {
                "date" : "2022-01-31T18:26:27",
                "id" : 333,
                "points" : [
                    { 
                        "x" : 0,
                        "y" : 115
                    },
                    { 
                        "x" : 1,
                        "y" : 95
                    },
                    { 
                        "x" : 2,
                        "y" : 110
                    },
                    { 
                        "x" : 3,
                        "y" : 130
                    },
                    { 
                        "x" : 4,
                        "y" : 130
                    }
                ]
            }
        }

    ]

  },
 
  "transform": [
    {
      "flatten": [
        "_source.points"
      ], "as" : [
        "points"
      ]
    }
  ],
  
  "mark": {
    "type": "line",
    "interpolate": "basis"
  },
  
  "encoding": {
  
    "x": {
      "field": "points.x",
      "type": "quantitative",
      "title" : "index"
    },
    
    "y": {
      "field": "points.y", 
      "type": "quantitative",
      "title" : "value"
    },

    "color" : {
      "field" : "_source.id",
      "type" : "nominal",
      "title" : "device id"
    }

  }
}

Here is the output

Layering
I have pretty much given up on the layering idea. I don't think it's possible to create a dynamic layer per document, but with the nominal encoding to color them, I don't think I need the layering. I will probably add a filtering mechanism in the Kibana dashboard.

Thanks again for the help. I will continue pushing along...

Tomo_M · May 23, 2022, 5:25am

Thank you for sharing your specification!

qd-danh · June 2, 2022, 5:16am

To test my scenario more accurately, I have updated my test data to better match my real data in the Elastic index. My data has two levels of arrays, and I got it all working with a "double flatten": one flatten transform for each array level, which generates another document at the root for each line and for each point in those lines. I thought I would post the working spec in case it helps someone else with this data layout.

I'm using inline data for now, to have this working in the Vega Editor.
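
In Kibana, the inline `values` would typically give way to an Elasticsearch query. A sketch of what that `data` block might look like, with the `transform`, `mark`, and `encoding` sections omitted. The index name `my-index` and the `size` of 10 are placeholders, not details from this thread:

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Sketch only: 'my-index' is a placeholder index name.",
  "data": {
    "url": {
      "%context%": true,
      "index": "my-index",
      "body": {"size": 10}
    },
    "format": {"property": "hits.hits"}
  }
}
```

With `format.property` set to `hits.hits`, each row keeps the `_id` / `_source` shape of the inline sample, so the same transforms and encodings should carry over unchanged.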

The outer array represents a line series. The inner array represents the points of each line. I use a calculate transform to compute a line ID, then use that ID to group and color the lines with a nominal encoding.
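
A minimal sketch of how such a double flatten might look in Vega-Lite. This is not the actual spec from the post: the `lines` array, the per-line `name` field, and the aliases `line`, `pt`, and `lineId` are all assumed names for illustration.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Sketch only: 'lines', 'name', 'line', 'pt' and 'lineId' are assumed names, not taken from the real index.",
  "data": {
    "values": [
      {
        "_source": {
          "id": 111,
          "lines": [
            {"name": "a", "points": [{"x": 0, "y": 100}, {"x": 1, "y": 60}, {"x": 2, "y": 105}]},
            {"name": "b", "points": [{"x": 0, "y": 98}, {"x": 1, "y": 135}, {"x": 2, "y": 110}]}
          ]
        }
      },
      {
        "_source": {
          "id": 222,
          "lines": [
            {"name": "a", "points": [{"x": 0, "y": 115}, {"x": 1, "y": 95}, {"x": 2, "y": 110}]}
          ]
        }
      }
    ]
  },
  "transform": [
    {"flatten": ["_source.lines"], "as": ["line"]},
    {"flatten": ["line.points"], "as": ["pt"]},
    {"calculate": "datum._source.id + '-' + datum.line.name", "as": "lineId"}
  ],
  "mark": "line",
  "encoding": {
    "x": {"field": "pt.x", "type": "quantitative", "title": "index"},
    "y": {"field": "pt.y", "type": "quantitative", "title": "value"},
    "color": {"field": "lineId", "type": "nominal", "title": "line id"}
  }
}
```

The first flatten yields one row per line and the second one row per point, so this sample ends up with nine rows. The calculated `lineId` (`111-a`, `111-b`, `222-a`) keeps lines from the same document apart when they are colored and grouped.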

Resulting graph is shown here:

Hope that helps someone out there...

