Hi there!
I'm new to TSDS and time series in general. Let's say I have the following index mapping:
```
{
  "properties": {
    "@timestamp": {
      "type": "date"
    },
    "game_id": {
      "time_series_dimension": true,
      "type": "keyword"
    },
    "session": {
      "properties": {
        "id": {
          "time_series_dimension": true,
          "type": "keyword"
        }
      }
    },
    "event": {
      "properties": {
        "occurrences_count": {
          "time_series_metric": "gauge",
          "type": "long"
        },
        "time_spent_in_seconds": {
          "time_series_metric": "gauge",
          "type": "long"
        },
        "action": {
          "type": "keyword"
        }
      }
    }
  }
}
```
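For context, this mapping goes into an index template along these lines; the template and data stream names are just placeholders, and I'm not sure my routing_path is what it should be:

```
PUT _index_template/game-events-template
{
  "index_patterns": ["game-events*"],
  "data_stream": {},
  "template": {
    "settings": {
      "index.mode": "time_series",
      "index.routing_path": ["game_id", "session.id"]
    },
    "mappings": {
      "properties": {
        "@timestamp": { "type": "date" },
        "game_id": { "type": "keyword", "time_series_dimension": true },
        "session": {
          "properties": {
            "id": { "type": "keyword", "time_series_dimension": true }
          }
        },
        "event": {
          "properties": {
            "occurrences_count": { "type": "long", "time_series_metric": "gauge" },
            "time_spent_in_seconds": { "type": "long", "time_series_metric": "gauge" },
            "action": { "type": "keyword" }
          }
        }
      }
    }
  }
}
```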
I have many games that users play on their devices, and there are various events I want to store to analyse their behaviour.
I'd appreciate a bit more guidance in the documentation on TSDS use cases and on how to design a data pipeline that uses a TSDS.
For instance, it was not clear (at least to me) when to use the gauge or counter metrics, and I ran into the issue of not understanding why I couldn't get a sum aggregation over all my event_count values (which I had mapped with time_series_metric set to counter). Thanks to this comment I think I have a better understanding now, and my "occurrences_count" should actually be a gauge, if I'm not mistaken.
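To illustrate, this is roughly the query that was rejected back when event.occurrences_count was still mapped as a counter (the data stream name is a placeholder):

```
GET game-events/_search
{
  "size": 0,
  "aggs": {
    "total_occurrences": {
      "sum": { "field": "event.occurrences_count" }
    }
  }
}
```

As far as I understand it now, a counter field only supports a restricted set of aggregations (such as rate inside a date histogram), which is why the plain sum was rejected, whereas a gauge supports the usual min/max/sum/avg.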
Similarly, it is not crystal clear to me what to choose as a dimension, especially when one dimension may have high-cardinality values.
The doc makes sense when you think of temperature or CPU usage sensors, but would you recommend TSDS for discrete events with high cardinality on some dimensions? Or should a regular data stream with a timestamp be the preferred solution?
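To give a sense of the data, a single event document looks something like this (values are made up); session.id is the dimension I'm worried about cardinality-wise:

```
POST game-events/_doc
{
  "@timestamp": "2025-03-14T09:21:37Z",
  "game_id": "game-042",
  "session": {
    "id": "c1a4e7b0-5d2f-4b9a-8e3c-6f7d1a2b3c4d"
  },
  "event": {
    "action": "level_completed",
    "occurrences_count": 3,
    "time_spent_in_seconds": 42
  }
}
```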