Rolling up Data Streams

Hi there,

I'm curious whether making rollups from data streams is recommended/possible? It would be great if, after rolling over data from my data stream, I could also roll it up into bigger time chunks.

Thanks!

It's certainly possible; whether it's recommended will depend on what you are trying to achieve :slight_smile:

My goal is to use data streams for my Metricbeat data and then, after 7 days or so, have the 10s data rolled up into 30 min data. I want my dashboards to query both the rolled-up data and the current data, so that when looking at the past 2 weeks, data shows for the entire span; because the buckets will be larger, the 30 min data will still show up just like the more recent data.
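To make it concrete, I'm picturing dashboard queries along these lines, since as I understand it the _rollup_search endpoint can search live indices and a rollup index together (index and field names here are just placeholders):

GET metrics-live-*,metrics-rollup/_rollup_search
{
  "size": 0,
  "aggregations": {
    "cpu_over_time": {
      "date_histogram": {
        "field": "@timestamp",
        "fixed_interval": "30m"
      },
      "aggregations": {
        "avg_cpu": {
          "avg": { "field": "system.cpu.total.norm.pct" }
        }
      }
    }
  }
}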

From the documentation, it seems that you can only search one rollup index at a time, which concerns me because my Metricbeat data streams create indices for each module (and I also have several different types of machines running Metricbeat that create separate indices), e.g.

metrics-system.cpu-prod-companyA, metrics-system.network-prod-companyA

metrics-system.cpu-prod-companyB, metrics-system.network-prod-companyB

So would I only be able to search either A or B when looking at rollups? Would I need to combine all the different company metrics into the same data stream, e.g.

metrics-system.cpu-prod-companyAandcompanyB, metrics-system.network-prod-companyAandcompanyB
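To put it another way, if I'm reading the docs right, a request that names two rollup indices, like this one (index and field names made up), would be rejected because _rollup_search only accepts a single rollup index per request:

GET companyA-rollup,companyB-rollup/_rollup_search
{
  "size": 0,
  "aggregations": {
    "max_cpu": {
      "max": { "field": "system.cpu.total.norm.pct" }
    }
  }
}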

Also, how are the multiple underlying .ds-* backing indices handled for rollups?

Any thoughts, @warkolm?

bump

Can you share one of the rollup jobs you have configured for this?

PUT _rollup/job/test_rollupA
{
  "id": "test_rollupA",
  "index_pattern": "test-metricbeat-companyA*",
  "rollup_index": "test-metricbeat_companyA-rollup",
  "cron": "0 0 0 * * ?",
  "page_size": 1000,
  "groups": {
    "date_histogram": {
      "interval": "60m",
      "delay": "7d",
      "time_zone": "UTC",
      "field": "@timestamp"
    },
    "terms": {
      "fields": [
        "container.id.keyword",
        "container.image.name.keyword",
        "container.name.keyword",
        "container.runtime.keyword"
      ]
    }
  },
  "metrics": [
    {
      "field": "docker.cpu.system.ticks",
      "metrics": [
        "avg"
      ]
    },
    {
      "field": "docker.cpu.total.norm.pct",
      "metrics": [
        "avg",
        "max",
        "min",
        "sum",
        "value_count"
      ]
    }
  ]
}

PUT _rollup/job/test_rollupB
{
  "id": "test_rollupB",
  "index_pattern": "test-metricbeat-companyB*",
  "rollup_index": "test-metricbeat_companyB-rollup",
  "cron": "0 0 0 * * ?",
  "page_size": 1000,
  "groups": {
    "date_histogram": {
      "interval": "60m",
      "delay": "7d",
      "time_zone": "UTC",
      "field": "@timestamp"
    },
    "terms": {
      "fields": [
        "container.id.keyword",
        "container.image.name.keyword",
        "container.name.keyword",
        "container.runtime.keyword"
      ]
    }
  },
  "metrics": [
    {
      "field": "docker.cpu.system.ticks",
      "metrics": [
        "avg"
      ]
    },
    {
      "field": "docker.cpu.total.norm.pct",
      "metrics": [
        "avg",
        "max",
        "min",
        "sum",
        "value_count"
      ]
    }
  ]
}

Sorry for the delay and thanks for getting back to me. I posted the above for you to see.
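In case it's relevant, after creating them I also start both jobs:

POST _rollup/job/test_rollupA/_start
POST _rollup/job/test_rollupB/_start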
