Resource estimation for calculating Max TPS

Dear Elastic Community,

I need help estimating the system resources our EFK stack needs to calculate the Max TPS.

Current Setup:
Environment: Kubernetes Cluster
Elasticsearch:

  • Version: 7.17.3
  • 6 Pods/Containers
  • Each Pod has 2 CPU cores and 4 GB of memory allocated.
  • ES_JAVA_OPTS on each pod: -Xmx2g / -Xms2g (I followed this as a reference; see the heap check below).
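
To confirm the heap that actually applied on each node, I run this quick sanity check in Kibana Dev Tools (the 2 GB heap is 50% of the 4 GB pod memory, in line with the usual guidance):

    GET _cat/nodes?v&h=name,heap.max,heap.percent,ram.max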

Kibana:

  • Version: 7.17.3
  • 1 Pod/Container with 1 CPU core and 2 GB of memory

Index Template Settings:

  • Number of Shards: 8
  • Number of Replicas: 2
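
For completeness, the template settings are applied roughly like this (a sketch; the template name and index pattern are placeholders for the real ones):

    PUT _index_template/app-logs
    {
      "index_patterns": ["app-logs-*"],
      "template": {
        "settings": {
          "number_of_shards": 8,
          "number_of_replicas": 2
        }
      }
    }

With 2 replicas this means 8 × (1 + 2) = 24 shard copies per index, spread across the 6 data pods (4 copies per pod).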

Scenario:
From the application logs I need to calculate the Max TPS (transactions per second). Every transaction consists of multiple log entries, uniquely identified by an ID field in the logs. I'm using an aggregation-based data table in which every row contains the Max TPS of one day, and I'm aiming to calculate the Max TPS over 3 to 5 days, using {"fixed_interval" : "1s"} for accuracy.
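
The query Kibana effectively runs should be roughly equivalent to this (a sketch; app-logs-* and transaction_id are placeholders for my actual index pattern and ID field):

    POST app-logs-*/_search
    {
      "size": 0,
      "query": {
        "range": { "@timestamp": { "gte": "now-3d/d", "lt": "now/d" } }
      },
      "aggs": {
        "per_day": {
          "date_histogram": { "field": "@timestamp", "calendar_interval": "1d" },
          "aggs": {
            "per_second": {
              "date_histogram": { "field": "@timestamp", "fixed_interval": "1s" },
              "aggs": {
                "tx": { "cardinality": { "field": "transaction_id" } }
              }
            },
            "max_tps": {
              "max_bucket": { "buckets_path": "per_second>tx" }
            }
          }
        }
      }
    }

Each per_day bucket corresponds to one table row: max_tps is the highest per-second count of unique transaction IDs within that day.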

Problem Statement:
Whenever I try to calculate the Max TPS for longer time frames (like 3 or 7 days), Kibana usually throws one of the following errors:

  • Data too large
  • Bucket size too large
  • Shards Failed

For applications/services with less traffic and fewer logs, this setup can calculate the Max TPS over 1 to 3 days, but for applications/services with more traffic/logs, even calculating the Max TPS for a whole day (24 hours) results in one of the above-mentioned errors.
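
If my math is right, a 1-second histogram produces 86,400 buckets per day, so 259,200 buckets for 3 days and 604,800 for 7 days. That is well above the default search.max_buckets limit of 65,536 in 7.17, which presumably explains the "Bucket size too large" error; "Data too large" looks like the circuit breaker tripping on the 2 GB heaps. The effective limit can be checked with:

    GET _cluster/settings?include_defaults=true&filter_path=*.search.max_buckets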

Question:
How can I calculate the Max TPS for longer time frames? How can I estimate the resources needed for this scenario? Is there a rule of thumb for sizing the required resources?

I'm considering this for both Production and UAT environments. Any suggestions or similar case examples are appreciated.


Hi @Lisa_Jung,

Can you please guide me?