This is primarily caused by the order of aggregations. The query TSVB runs is basically this:
    must: [{
      range: {
        @timestamp: { gte, lt }
      }
    }, {
      range: {
        field1: { lt: 14400 }
      }
    }],
    aggs: {
      group_by: {
        terms: { field: "field1", size: 5 },
        aggs: {
          dates: {
            date_histogram: { field: "@timestamp", fixed_interval: "100s" }
          }
        }
      }
    }
To put this in plain language, what this query does is:
- Matches documents in Elasticsearch that match both the time range and other filters. From your data, this matches:
{"timestamp":"2021-02-02T09:55:00.000Z","name":"Bravo","field1":100}
{"timestamp":"2021-02-02T09:51:40.000Z","name":"Bravo","field1":300}
{"timestamp":"2021-02-02T09:55:00.000Z","name":"Charlie","field1":100}
{"timestamp":"2021-02-02T09:53:20.000Z","name":"Charlie","field1":200}
{"timestamp":"2021-02-02T09:51:40.000Z","name":"Charlie","field1":300}
- Finds all the values of name given the documents above. From those documents, it's Bravo and Charlie.
- For each value of name, splits the documents into date buckets. This results in something like this:
    Bravo:
      bucket 1: {"timestamp":"2021-02-02T09:55:00.000Z","name":"Bravo","field1":100}
      bucket 2: {"timestamp":"2021-02-02T09:51:40.000Z","name":"Bravo","field1":300}
    Charlie:
      bucket 1: {"timestamp":"2021-02-02T09:55:00.000Z","name":"Charlie","field1":100}
      bucket 2: {"timestamp":"2021-02-02T09:53:20.000Z","name":"Charlie","field1":200}
      bucket 3: {"timestamp":"2021-02-02T09:51:40.000Z","name":"Charlie","field1":300}
- Applies the TSVB settings to the split documents above, which shows bucket 3. Even though bucket 3 only contains a document from Charlie, Bravo is still a top-level key that exists in your dataset, so we show it.
Step 4 is not a bug: we have found that most users want to see all of the named entities in their data, and I believe we already have the right default. We do sometimes add extra settings to TSVB, and what you are looking for might fall into that category.
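To make the ordering effect concrete, here is a rough Python sketch of the steps described above, run against the five matched documents. It is not TSVB's or Elasticsearch's actual code. The helper names (`bucket_start`, `split_by_name`, `series_for_bucket`) are made up for illustration, and it assumes the documents have already passed the range filters. It groups by `name` first and only then buckets by time, which is why every name stays a series key even for a bucket where it has no documents.

```python
from collections import defaultdict
from datetime import datetime, timezone

INTERVAL_SECONDS = 100  # mirrors fixed_interval: "100s" in the query

# The five documents that matched the filters (copied from the list above).
DOCS = [
    {"timestamp": "2021-02-02T09:55:00.000Z", "name": "Bravo", "field1": 100},
    {"timestamp": "2021-02-02T09:51:40.000Z", "name": "Bravo", "field1": 300},
    {"timestamp": "2021-02-02T09:55:00.000Z", "name": "Charlie", "field1": 100},
    {"timestamp": "2021-02-02T09:53:20.000Z", "name": "Charlie", "field1": 200},
    {"timestamp": "2021-02-02T09:51:40.000Z", "name": "Charlie", "field1": 300},
]


def bucket_start(ts: str) -> int:
    """Return the epoch second at which the 100s bucket containing ts starts."""
    dt = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ").replace(tzinfo=timezone.utc)
    epoch = int(dt.timestamp())
    return epoch - epoch % INTERVAL_SECONDS


def split_by_name(docs):
    """Outer grouping by name, inner grouping by date bucket."""
    groups = defaultdict(lambda: defaultdict(list))
    for doc in docs:
        groups[doc["name"]][bucket_start(doc["timestamp"])].append(doc)
    return groups


def series_for_bucket(groups, bucket):
    """Read one date bucket per name; names with no docs there still appear."""
    return {
        name: [doc["field1"] for doc in buckets.get(bucket, [])]
        for name, buckets in groups.items()
    }


if __name__ == "__main__":
    groups = split_by_name(DOCS)
    # The 09:53:20 bucket is the one where only Charlie has a document.
    only_charlie = bucket_start("2021-02-02T09:53:20.000Z")
    print(series_for_bucket(groups, only_charlie))
```

In this sketch the set of series keys is fixed by the outer grouping, so `Bravo` shows up with an empty list for the 09:53:20 bucket instead of disappearing. Hiding such series would need an extra filtering step after the split, which is the kind of optional setting mentioned above.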