Your transform already shows one way to access the data from Elasticsearch (ES). The calculate transform assigns the value toDate(datum._source['@timestamp']) to a field (or column header) called time.
Let's break this down. toDate() is just a function that creates a date object initialized by whatever value datum._source['@timestamp'] may be.
This datum is a special Vega term. It refers to the data that you defined in data, one record at a time. In your case, that data is the JSON object that Elasticsearch returns for the query in your url parameter.
If you run the content of the url parameter yourself, for example in Dev Tools, you can see what ES returns:
GET postgresql*/_search
{
"size": 10000,
"_source": ["@timestamp", "beat.name", "postgresql.database.rows.inserted"]
}
The response should look like this:
{
"took" : ...,
"timed_out" : false,
"_shards" : { ... },
"hits" : {
"total" : ...,
"max_score" : 1.0,
"hits" : [
{
"_index": "postgresql",
"_type": "_doc",
"_id": "....",
"_score": 1.0,
"_source": {
"@timestamp": "2019-05-29T04:12:53.318Z",
"postgresql" : {
"database" : {
"rows" : {
"inserted" : 83977835
}
}
},
"beat" : {
"name" : "some_host"
}
}
},
{
"_index": "postgresql",
"_type": "_doc",
"_id": "....",
"_score": 1.0,
"_source": {
"@timestamp": "2019-05-29T05:20:12.148Z",
"postgresql" : {
"database" : {
"rows" : {
"inserted" : 12
}
}
},
"beat" : {
"name" : "some_other_host"
}
}
},
...
]
}
}
In your data definition, you probably also have format: {property: "hits.hits"} somewhere in there. Because of that, Vega takes its rows from exactly the array in your hits.hits, and datum stands for each element of that array in turn:
[
{
"_index": "postgresql",
"_type": "_doc",
"_id": "....",
"_score": 1.0,
"_source": {
"@timestamp": "2019-05-29T04:12:53.318Z",
"postgresql" : {
"database" : {
"rows" : {
"inserted" : 83977835
}
}
},
"beat" : {
"name" : "some_host"
}
}
},
{
"_index": "postgresql",
"_type": "_doc",
"_id": "....",
"_score": 1.0,
"_source": {
"@timestamp": "2019-05-29T05:20:12.148Z",
"postgresql" : {
"database" : {
"rows" : {
"inserted" : 12
}
}
},
"beat" : {
"name" : "some_other_host"
}
}
},
...
]
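To make the effect of the format property concrete, here is a rough Python sketch (not Vega code) of what it does conceptually. The helper name select_property is hypothetical, the response is a trimmed copy of the example above, and Vega's real implementation may well differ in detail.

```python
# Trimmed copy of the Elasticsearch response shown above.
response = {
    "hits": {
        "hits": [
            {"_source": {"@timestamp": "2019-05-29T04:12:53.318Z",
                         "beat": {"name": "some_host"}}},
            {"_source": {"@timestamp": "2019-05-29T05:20:12.148Z",
                         "beat": {"name": "some_other_host"}}},
        ]
    }
}


def select_property(obj, path):
    """Follow a dotted path such as 'hits.hits' into a nested dict (hypothetical helper)."""
    for key in path.split("."):
        obj = obj[key]
    return obj


# The selected array becomes the rows; each element plays the role of datum.
rows = select_property(response, "hits.hits")
for datum in rows:
    print(datum["_source"]["beat"]["name"])
```

Each pass through the loop corresponds to one datum, which is why an expression like datum._source is evaluated once per hit rather than once for the whole response.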
By saying datum._source you narrow each element down to its _source object, so across all rows you get:
[
{
"@timestamp": "2019-05-29T04:12:53.318Z",
"postgresql" : {
"database" : {
"rows" : {
"inserted" : 83977835
}
}
},
"beat" : {
"name" : "some_host"
}
},
{
"@timestamp": "2019-05-29T05:20:12.148Z",
"postgresql" : {
"database" : {
"rows" : {
"inserted" : 12
}
}
},
"beat" : {
"name" : "some_other_host"
}
},
...
]
Going one step further, datum._source['@timestamp'] picks out the timestamp of each row, which across all rows gives this array of date strings:
[
"2019-05-29T04:12:53.318Z",
"2019-05-29T05:20:12.148Z",
...
]
At this point, imagine that Vega has created a table. The table has one column with a column header called time (which is what you called it using calculate). The rows have the values toDate("2019-05-29T04:12:53.318Z"), then toDate("2019-05-29T05:20:12.148Z"), and so on.
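The table picture can be imitated in a few lines of Python. This is only an analogy for what the calculate transform does, not how Vega is implemented; to_date and calculate are hypothetical helpers standing in for Vega's toDate and calculate, and the rows are cut down to the timestamps from the example above.

```python
from datetime import datetime, timezone

rows = [
    {"_source": {"@timestamp": "2019-05-29T04:12:53.318Z"}},
    {"_source": {"@timestamp": "2019-05-29T05:20:12.148Z"}},
]


def to_date(value):
    """Parse a UTC timestamp string in the format Elasticsearch returned above."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def calculate(rows, expression, as_field):
    """Evaluate expression once per row (the datum) and store the result in a new column."""
    return [{**datum, as_field: expression(datum)} for datum in rows]


# Analogue of: calculate "toDate(datum._source['@timestamp'])" as "time"
table = calculate(rows, lambda datum: to_date(datum["_source"]["@timestamp"]), "time")
for row in table:
    print(row["time"].isoformat())
```

Note that the original fields stay on each row; calculate only adds the new time column next to them.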
You refer to this column in your encoding by saying that you want your x-axis to use the values from that column, time.
You can create another column by adding another entry to your transform:
transform: [
  {
    calculate: "toDate(datum._source['@timestamp'])"
    as: "time"
  },
  {
    calculate: "datum._source.postgresql.database.rows.inserted"
    as: "no_of_rows"
  }
]
This gives you two columns: time and no_of_rows. By the way, the syntax _source['...'] was used instead of the usual dot notation only because the expression parser wasn't happy to see the @ character in dot notation. The workaround is to use [' ... '].
So now you can write your encoding using two sets of values.
encoding: {
  x: {
    field: time
    type: temporal
    axis: {title: "Date"}
  }
  y: {
    field: no_of_rows
    type: quantitative
    axis: {title: "Number of rows inserted"}
  }
}
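If it helps to see what the encoding does with those two columns, here is a small Python analogy. It is a sketch under assumptions: encode is a hypothetical helper, the encoding is reduced to a plain channel-to-column mapping, and the table holds the two example rows as they would look after both calculate steps.

```python
from datetime import datetime, timezone

# Rows after both calculate steps, using the values from the example response.
table = [
    {"time": datetime(2019, 5, 29, 4, 12, 53, 318000, tzinfo=timezone.utc),
     "no_of_rows": 83977835},
    {"time": datetime(2019, 5, 29, 5, 20, 12, 148000, tzinfo=timezone.utc),
     "no_of_rows": 12},
]

# Simplified stand-in for the encoding block: channel name -> column name.
encoding = {"x": "time", "y": "no_of_rows"}


def encode(table, encoding):
    """For every row, look up the column that each channel points at."""
    return [
        {channel: row[field] for channel, field in encoding.items()}
        for row in table
    ]


points = encode(table, encoding)
for point in points:
    print(point["x"].isoformat(), point["y"])
```

Each resulting point is what ends up as one mark on the chart: its x position comes from time and its y position from no_of_rows.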