Hi,
I have a few questions about time series data streams (TSDS). I've tried looking through the documentation, but I'm still confused about a few small things.
I was wondering whether fields that are not mapped as a dimension ("time_series_dimension": true) or as a metric (e.g. "time_series_metric": "gauge") are still stored more efficiently than they would be in a normal data stream.
For example, if I have this mapping:
{
  "properties": {
    "Dimension1": {
      "type": "keyword",
      "time_series_dimension": true
    },
    "Dimension2": {
      "type": "keyword",
      "time_series_dimension": true
    },
    "metric1": {
      "type": "integer",
      "time_series_metric": "gauge"
    },
    "other": {
      "type": "keyword"
    },
    "@timestamp": {
      "type": "date",
      "format": "strict_date_optional_time"
    }
  }
}
Would the "other" keyword be stored more efficiently than just using a normal data stream? If they are not stored as efficiently, then should I make every non-numeric field into a dimension? In this example "other" keyword would be a keyword that will only have 5 possible values - and I would like to keep count of logs with those values over time.
I also have one other question. I have some logs that I'm turning into metrics, and some of them end up with exactly the same dimensions and @timestamp. I know that a TSDS derives _id from a hash of the dimension values and @timestamp, but this poses a problem: around every 1000th log is being treated as a duplicate because it has the same @timestamp (to the millisecond) and dimension values as another log. Is there any way to stop the TSDS from automatically rejecting these documents as duplicates, besides hard-coding a uniqueness value into a dimension through something like Logstash?
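To make the collision concrete, two documents like these (values made up) are the ones being treated as duplicates, since they only differ in the metric value:

{ "@timestamp": "2024-01-01T00:00:00.123Z", "Dimension1": "a", "Dimension2": "b", "metric1": 10, "other": "x" }
{ "@timestamp": "2024-01-01T00:00:00.123Z", "Dimension1": "a", "Dimension2": "b", "metric1": 12, "other": "x" }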
Thanks!