Vega : Area chart - strange datapoints

Does anybody know why my data points are twisted at some positions (Fig. 1)?

I use the same data set in Vega-Lite and it works there, but in Vega I'm missing something.
Maybe someone can help me out here, because I've run out of ideas about where to look for the error.

Data sample:

{
	"score" : 18,
	"time_out" : false,
	"hits" : {
		"total" : {
			"value" : 123,
			"relation" : "eq"
		},
		"max_score" : 1.8,
		"hits" : [
			{
				"_index" : "dataIndex",
				"_type" : "log",
				"_source" :{
					"@timestamp" : "2017-01-13345:00:16.0301135Z",
					"numericValue" : 2.0
				}
			},
			{
				"_index" : "dataIndex",
				"_type" : "log",
				"_source" :{
					"@timestamp" : "2017-02-20345:10:16.0301135Z",
					"numericValue" : 3.0
				}
			}
		]
	}
}

Fig.1

Example in Vega - everything works as expected, except some points are skewed

{
  $schema: https://vega.github.io/schema/vega/v5.json
  data: [
    {
      name: table
      url: {
        %context%: true
        %timefield%: @timestamp
        index: dataIndex-*
        body: {
          size: 1000
          aggs: {
            hits: {
              date_histogram: {
                field: @timestamp
                fixed_interval: 10m
                extended_bounds: {
                  min: { %timefilter%: "min" }
                  max: { %timefilter%: "max" }
                }
              }
            }
          }
        }
      }
      format: {
        property: hits.hits
      }
      transform: [
        {
          type: formula
          as: varTime
          expr: toDate(datum._source['@timestamp'])
        }
        {
          type: filter
          expr: datum._source['@timestamp'] != null && datum._source['numericValue'] > 0
        }
      ]
    }
  ]
  scales: [
    {
      name: xscale
      type: time
      range: width
      domain: {
        data: table
        field: varTime
      }
    }
    {
      name: yscale
      type: linear
      range: height
      domain: {
        data: table
        field: _source.numericValue
      }
    }
  ]
  axes: [
    {
      orient: bottom
      scale: xscale
      format: %H:%M
    }
    {
      orient: left
      scale: yscale
    }
  ]
  marks: [
    {
      type: area
      from: {
        data: table
      }
      encode: {
        enter: {
          x: {
            scale: xscale
            field: varTime
          }
          y: {
            scale: yscale
            field: _source.numericValue
          }
          y2: {
            scale: yscale
            value: 0
          }
          fill: {
            value: steelblue
          }
        }
        update: {
          fillOpacity: {
            value: 1
          }
        }
        hover: {
          fillOpacity: {
            value: 0.5
          }
        }
      }
    }
  ]
}

Hi @Ganymede,

It looks to me like you are using the query results instead of the aggregations: you read from hits.hits, the transforms refer to fields inside _source, and size is greater than 0. When you only use aggs you should set size: 0 to reduce the payload, since size only affects the number of raw documents returned by the query.

That said, I think you've got 2 options:

  1. Keep your Vega definition as it is (relying on hits.hits). In that case I think you'll need to sort your documents by @timestamp: simply add "sort": [ { "@timestamp": { "order": "desc" } } ] to the body.
  2. (Recommended) Redefine your Vega definition to use the results from aggregations.hits.buckets.
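
For option 1, the request body could look roughly like the sketch below. It assumes the rest of the spec stays as posted and that only hits.hits is read, so the aggs block is left out here. Ascending order is my own choice for readability; the descending order suggested above should give the same shape, because the area mark only needs the points in a consistent time order.

```hjson
body: {
  size: 1000
  // Return the raw documents ordered by time so that the area
  // mark connects neighbouring points instead of jumping back.
  sort: [
    { "@timestamp": { order: "asc" } }
  ]
}
```

Keep in mind that size still caps the number of documents returned, so a wide time range may be truncated to the first 1000 documents in sort order.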

From what I can see, your suggestion seems right. I tried to implement a sort on the time axis, but it has failed so far because I attempted it in the transform.
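
For reference, a client-side sort is possible in principle with Vega's collect transform. The sketch below is an assumption rather than something tested against this dashboard: it appends a collect step that sorts by the derived varTime field after the existing formula and filter transforms.

```hjson
transform: [
  {
    type: formula
    as: varTime
    expr: toDate(datum._source['@timestamp'])
  }
  {
    type: filter
    expr: datum._source['@timestamp'] != null && datum._source['numericValue'] > 0
  }
  // Assumption: sort the rows by the derived time field in the browser.
  {
    type: collect
    sort: { field: "varTime" }
  }
]
```

Sorting in the request body is usually preferable, because it also decides which documents Elasticsearch returns when more than size documents match.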

I will try your snippet. Thank you so much!

Regarding point 2:
As far as I understand it, bucket aggregation is the newer approach and should be the go-to method, as you suggested. However, I struggle to make it work beyond the classic tutorial with document count and timestamp.

I believe I need a sub-aggregation inside aggs: {... aggs: {...}} or similar. I'll look more into this approach. Today I found a tutorial that goes in the direction of logs/ES/Kibana and bucket aggregation.
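
A minimal sketch of what such a nested aggregation might look like for the data source, assuming an average per 10-minute bucket is what is wanted. The avg metric and the sub-aggregation name value are my own assumptions; the index, field names, and interval are taken from the spec above.

```hjson
data: [
  {
    name: table
    url: {
      %context%: true
      %timefield%: @timestamp
      index: dataIndex-*
      body: {
        // Only the buckets are read, so no raw documents are requested.
        size: 0
        aggs: {
          hits: {
            date_histogram: {
              field: @timestamp
              fixed_interval: 10m
              extended_bounds: {
                min: { %timefilter%: "min" }
                max: { %timefilter%: "max" }
              }
            }
            // Sub-aggregation: one number per time bucket.
            // "value" is a hypothetical name, avg is an assumed metric.
            aggs: {
              value: {
                avg: { field: "numericValue" }
              }
            }
          }
        }
      }
    }
    // Read the buckets instead of hits.hits.
    format: { property: "aggregations.hits.buckets" }
  }
]
```

With this shape each row is one bucket, so the scales and the area mark would use key for the x field and value.value for the y field. Buckets come back in time order, and empty buckets carry a null average, which may need a filter transform.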

Thank you so much for pointing me in the right direction.


Edit:
It really was a missing sort on the timestamp in the body. I never thought this would be an issue. Thank you again!
