Need help reducing the volume of data

Hi there!

I am trying to reduce the volume of my data in metrics monitoring. Currently, I filter out fields using Logstash and retain only the fields that I need for my dashboards.

I have 7 servers, and each server generates 204 MB of data per day. So, for 7 servers, that totals about 1.4 GB per day (1,428 MB).

Is there a way to further reduce the data? See the picture below; these are the fields I have retained for the dashboards.

Hi,

This is a very nice question.

From my understanding, you are already using the default tools and have removed as many fields as you could. The next things to look at are your indexing strategy and your cluster size: how many nodes do you have?

Can you lower the metric frequency (that is, lengthen the collection period), let's say 1 event per minute instead of 30?

How much precision do you need?

How much retention do you need?

You can also trim many more metadata fields if you don't need them.
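
As a rough illustration of trimming metadata on the Metricbeat side, before events reach Logstash, a `drop_fields` processor could look like the sketch below. The field list is only an assumption about what a typical event carries and what the dashboards can live without; check each field against your own dashboards before dropping it.

```yaml
processors:
  - drop_fields:
      # Hypothetical selection of metadata fields; keep anything a
      # dashboard, filter, or alert still references.
      fields:
        - agent.ephemeral_id
        - agent.id
        - ecs.version
        - host.architecture
        - host.os
        - host.mac
      # Do not fail the event if one of the fields is absent.
      ignore_missing: true
```

Dropping fields at the source means they are never shipped or parsed, which may be slightly cheaper than removing them later in the pipeline, but the stored result should be the same either way.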

Are you using a time series data stream (TSDS), and specifically the time_series index mode? See "Time series data stream (TSDS)" in the Elasticsearch Guide [8.15].

What is your version?

Hi,

I just tried lowering the metric frequency by increasing the collection period. Previously, I had set it to 10 seconds. Now I am monitoring it to see whether this reduces the data.

I would like to retain the data for 365 days before deleting it.

However, I’m not sure what else I need to filter out because the remaining fields are necessary for the dashboard.

For my setup, I am using a single node for this small deployment.

This is my metrics configuration:

- module: system
  period: 30s
  metricsets:
    - cpu
    - memory
    - network
  process.include_top_n:
    by_cpu: 5      # include top 5 processes by CPU
    by_memory: 5   # include top 5 processes by memory

- module: system
  period: 2m
  metricsets:
    - filesystem
    - fsstat
  processors:
  - drop_event.when.regexp:
      system.filesystem.mount_point: '^/(sys|cgroup|proc|dev|etc|host|lib|snap)($|/)'

- module: system
  period: 2m
  metricsets: ["diskio"]
  processors:
  - drop_event.when.regexp:
      system.diskio.name: '^sr0$'

- module: system
  period: 2m
  metricsets: ["process"]
  processors:
  - drop_event.when.regexp:
      system.process.name: '^.*(kworker|ksoftirqd|rcu|watchdog|migration|kthread|rcu_sched|systemd|agetty|auditd|sshd|bash|ksmd|lvmetad|scsi|khungtaskd|jbd2|kblockd|bioset|dbus|khelper|kmpath|kintegrityd|khugepaged|fsnotify|ata_sff|LCPDEV|perf|crond|irqbalance|kdevtmpfs|writeback|deferwq|kswapd|kthrotld|kpsmoused|tail|vballoon|polkitd|metricbeat|ttm_swap|syslog|udp_rcv).*'
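
One thing that may be worth checking in the configuration above: `process.include_top_n` sits in the module block that lists `cpu`, `memory`, and `network`, not in the block that enables the `process` metricset. Assuming the option only takes effect in a block that includes the `process` metricset, a sketch like the following might cut the number of process events per collection cycle. Treat it as an untested suggestion rather than a confirmed fix.

```yaml
- module: system
  period: 2m
  metricsets: ["process"]
  # Report only the top 5 processes by CPU and the top 5 by memory,
  # instead of one event per running process.
  process.include_top_n:
    by_cpu: 5
    by_memory: 5
```

The existing `drop_event` processor can stay in this block alongside the option if the name-based filter is still wanted.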

Thank you for your response! :slight_smile:

Hi @dadoonet ,

I am using the regular data stream.

The version I am using is 8.9.0.

Thank you for your response. :slight_smile: