Simple scatter plot - infinite extents when using transform

qd-danh · May 5, 2022, 12:06am

New to Kibana + Vega and going through lots of tutorials, landed on this video to try to make a simple scatter plot. Using the sample eCommerce data.

I am so close, but something is still slightly wrong. I've compared what I have to the docs and can't see what I'm doing wrong. I have been adapting what I see in the video to what seem to be recent versions of Vega / Kibana, based on the docs and other examples.

My understanding is that after the transform I should have a calculated field called time, but using it in the encoding section throws the error shown after the snippet:

{
  calculate: "toDate(datum.order_date)"
  as: "time"
}

. . .

encoding: {
  x: {
    field: time
    type: temporal
    axis: { title: false }
  }
  
  y: { 
    field: total_quantity
    type: quantitative
    axis: { title: "Order Quantity" }
  }
  
}

Infinite extent for field "time": [Infinity, -Infinity]

I have tried things like _source.time, datum.time, ... I'm just not getting why the "time" property is not present. I also looked through the debugging techniques in the "inspect" panel and the F12 browser dev tools, and I'm not seeing the calculated time property there either.

here is my full spec file

Thank you in advance for any help!

jsanz · May 5, 2022, 3:32pm

I see several errors:

The format key needs to go one level out of the url key.
The full values key is the response to the query and needs to be removed.
The calculate needs to refer to datum._source['order_date'], because each datum is an Elasticsearch hit and the document fields sit under its _source key.

This reduced spec works:

{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "mark": "point",
  "data": {
    "url": {
      "%context%": true,
      "%timefield%": "order_date",
      "index": "kibana_sample_data_ecommerce",
      "body": {"size": 10, "_source": ["order_date", "total_quantity"]}
    },
    "format": {"property": "hits.hits"}
  },
  "transform": [
    {"calculate": "toDate(datum._source['order_date'])", "as": "time"}
  ],
  "encoding": {
    "x": {"field": "time", "type": "temporal", "axis": {"title": false}},
    "y": {
      "field": "_source.total_quantity",
      "type": "quantitative",
      "axis": {"title": "Order Quantity"}
    }
  }
}

I compared your spec against the scatter plot visualization from the logs sample dataset bundle.

qd-danh · May 5, 2022, 6:45pm

Big thank you Jorge for the fixes! Had a feeling they were small issues in formatting or understanding. Some follow up thoughts, perhaps to help others that come across this thread in the future:

Where is the scatter plot sample that you took a screenshot of? I have the web logs sample data imported into my Elastic / Kibana instance and see a few other sample visualizations (screenshot below), but not that scatter plot.

After seeing calculate working with your code, I was playing around with alternative syntax that I've seen around the web. It looks like toDate(datum._source.order_date) works just as well as the bracket (indexer) syntax you provided, toDate(datum._source['order_date']). Is there any difference?

On the values key in the spec I provided: my understanding was that I should post the full spec, which includes the data values instead of the query to Elastic. This way others can "run" the spec as is in something like the Vega editor. When asking for help with data that others don't have or can't reproduce, it's a stand-alone way to run the visualization. I totally understand that the values key is not used or wanted in the spec as written for the Vega visualization in Kibana. FYI, I saw the steps here suggesting to post the full spec.
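As a rough sketch of that stand-alone form: the spec below inlines a tiny, response-shaped payload under data.values and keeps format.property so that each hit becomes one datum. It is not the spec from this thread. The two hits are made-up placeholders, and the sketch assumes inline data accepts an object under values together with format.property. It is plain JSON, with axis.title set to null to hide the x-axis title.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Stand-alone sketch. The two hits under data.values are made-up placeholders, not real sample data.",
  "mark": "point",
  "data": {
    "values": {
      "hits": {
        "hits": [
          {"_source": {"order_date": "2022-05-01T10:15:00+00:00", "total_quantity": 2}},
          {"_source": {"order_date": "2022-05-02T18:40:00+00:00", "total_quantity": 4}}
        ]
      }
    },
    "format": {"property": "hits.hits"}
  },
  "transform": [
    {"calculate": "toDate(datum._source['order_date'])", "as": "time"}
  ],
  "encoding": {
    "x": {"field": "time", "type": "temporal", "axis": {"title": null}},
    "y": {
      "field": "_source.total_quantity",
      "type": "quantitative",
      "axis": {"title": "Order Quantity"}
    }
  }
}
```

Pasted into the Vega editor, this should draw two points. In Kibana, the values block would be swapped back for the url query against the index.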

Huge thanks again for getting me over the hurdles.

jsanz · May 6, 2022, 1:25pm

To be honest, I have no idea when I added the sample data to my testing cluster but, like you, I've just re-added it in my 8.2.0 instance and that chart is gone from the new Logs dashboard.

Anyways, sharing it here for future reference.

{
  $schema: "https://vega.github.io/schema/vega-lite/v4.json"
  // Use points for drawing to actually create a scatterplot
  mark: point
  // Specify where to load data from
  data: {
    // By using an object to the url parameter we will
    // construct an Elasticsearch query
    url: {
      // Context == true means filters of the dashboard will be taken into account
      %context%: true
      // Specify on which field the time picker should operate
      %timefield%: timestamp
      // Specify the index pattern to load data from
      index: kibana_sample_data_logs
      // This body will be sent to Elasticsearch's _search endpoint
      // You can use everything the ES Query DSL supports here
      body: {
        // Set the size to load 10000 documents
        size: 10000,
        // Just ask for the fields we actually need for visualization
        _source: ["timestamp", "bytes", "extension"]
      }
    }
    // Tell Vega, that the array of data will be inside hits.hits of the response
    // since the result returned from Elasticsearch will have a format like:
    // {
    //   hits: {
    //     total: 42000,
    //     max_score: 2,
    //     hits: [
    //       < our individual documents >
    //     ]
    //   }
    // }
    format: { property: "hits.hits" }
  }
  // You can do transformation and calculation of the data before drawing it
  transform: [
    // Since timestamp is a string value, we need to convert it to a unix timestamp
    // so that Vega can work on it properly.
    {
      // Convert _source.timestamp field to a date
      calculate: "toDate(datum._source['timestamp'])"
      // Store the result in a field named "time" in the object
      as: "time"
    }
  ]
  // Specify what data will be drawn on which axis
  encoding: {
    x: {
      // Draw the time field on the x-axis in temporal mode (i.e. as a time axis)
      field: time
      type: temporal
      // Hide the axis label for the x-axis
      axis: { title: false }
    }
    y: {
      // Draw the bytes of each document on the y-axis
      field: _source.bytes
      // Mark the y-axis as quantitative
      type: quantitative
      // Specify the label for this axis
      axis: { title: "Transferred bytes" }
    }
    color: {
      // Make the color of each point depend on the _source.extension field
      field: _source.extension
      // Treat different values as completely unrelated values to each other.
      // You could switch this to quantitative if you have a numeric field and
      // want to create a color scale from one color to another depending on that
      // field's value.
      type: nominal
      // Rename the legend title so it won't just state: "_source.extension"
      legend: { title: 'File type' }
    }
    shape: {
      // Also make the shape of each point dependent on the extension.
      field: _source.extension
      type: nominal
    }
  }
}

On the dot vs. bracket question: no difference. For properties that are valid identifiers, it's fine to use dot notation. More details here.
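To illustrate the two notations, here is a small transform fragment in the same HJSON style as the spec above. The output names time_dot, time_bracket and time_at are hypothetical, and the "@timestamp" field is an assumption used only to show a name that is not a valid identifier; neither sample dataset in this thread is claimed to contain it.

```hjson
transform: [
  // order_date is a valid identifier, so dot and bracket notation
  // read the same property.
  { calculate: "toDate(datum._source.order_date)", as: "time_dot" }
  { calculate: "toDate(datum._source['order_date'])", as: "time_bracket" }
  // Assumption: a field named "@timestamp" (not used in this thread).
  // It is not a valid identifier, so bracket notation is required here.
  { calculate: "toDate(datum._source['@timestamp'])", as: "time_at" }
]
```

Bracket notation is the safe default when a field name contains characters such as @ or -, since dot notation cannot express those names.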

qd-danh · May 6, 2022, 10:59pm

Thanks for posting the spec for that logs visual. Filed it away for safekeeping.

Thanks again for the help in general.

system · June 3, 2022, 10:59pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
