Elasticsearch resource calculation

Ibrahim_Can_Duran · August 29, 2023, 2:08pm

I am trying to calculate the resource requirements for an ELK system that will be deployed on k8s.

The total data volume will be 4 TB and I will use 1 replica. Are there equations for the required number of shards and vCPUs, given that all data will sit on hot shards and I will set the shard size to 50 GB?

sholzhauer · August 29, 2023, 2:14pm

Hi,

I'd advise you to have a look at the answer I posted in this thread.

More details will help in answering, and the resources I linked might help you on the way.

Good luck!

Ibrahim_Can_Duran · August 31, 2023, 5:47am

Hi, I checked the thread. Let me explain the equation I used to determine the resources. I have 4 TB of data and I am going to use 1 replica, so we can say 8 TB of data. I will use active shards for indexing and keep the shard size at 50 GB, so I need 160 active shards. The documentation I followed recommends a 1:1.5 shard-to-vCPU ratio for active shards, which makes 240 vCPUs. Since I need a lot of pods to reach that, the RAM and ephemeral storage increase automatically as well.
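For reference, the arithmetic in this post can be written out as a short Python sketch. The function name is hypothetical, the 1 TB = 1000 GB convention is an assumption, and the 1:1.5 shard-to-vCPU ratio is only the starting point recalled by the poster, not a confirmed recommendation.

```python
import math


def estimate_shards_and_vcpus(primary_tb, replicas, shard_size_gb, vcpus_per_shard):
    """Rough estimate following the reasoning in the post (hypothetical helper).

    Assumes 1 TB = 1000 GB and that every shard, primary or replica,
    counts as an "active" shard for the vCPU ratio.
    """
    # Primary data plus one full copy per replica.
    total_gb = primary_tb * 1000 * (1 + replicas)
    # Number of shards needed at the target shard size, rounded up.
    shards = math.ceil(total_gb / shard_size_gb)
    # vCPUs implied by the assumed shard-to-vCPU ratio, rounded up.
    vcpus = math.ceil(shards * vcpus_per_shard)
    return total_gb, shards, vcpus


total_gb, shards, vcpus = estimate_shards_and_vcpus(
    primary_tb=4, replicas=1, shard_size_gb=50, vcpus_per_shard=1.5
)
print(f"total data: {total_gb} GB, shards: {shards}, vCPUs: {vcpus}")
```

With the numbers from the post this gives 8000 GB, 160 shards and 240 vCPUs. The result scales linearly with the assumed ratio, so the vCPU figure is only as reliable as that ratio.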

Christian_Dahlqvist · August 31, 2023, 6:05am

Where does this recommendation come from?

How you size your cluster will depend on the use case, so you will need to provide more details on this:

What is the use case?

Is your data immutable or are you performing updates?

What type of data are you indexing?

Will you be using time-based indices in some form, e.g. data streams?

How are you accessing and querying the data? Kibana? Custom APIs?

What is the average and peak indexing/update rate?

How many concurrent queries/searches do you expect to need to support?

How large a portion of the data does each search/query typically target?

What are your latency requirements for queries/searches?

Ibrahim_Can_Duran · August 31, 2023, 10:28am

Hi @Christian_Dahlqvist,
I don't have answers to all of your questions, but you can find what I know below. I hope that will be enough to determine a way forward.

Where does this recommendation come from?

I can't recall the source, but as far as I remember a 1:1.5 ratio was recommended as a starting point. It was also recommended to check the system performance and adjust this value.

What is the use case?

Store cell phone call data and visualize it.

Is your data immutable or are you performing updates?

It can be updated.

What type of data are you indexing?

Location based information
Numbers and texts

Will you be using time-based indices in some form, e.g. data streams?

Some indices will be time-based.

How are you accessing and querying the data? Kibana? Custom APIs?

Kibana and custom Kibana plugins.

What is the average and peak indexing/update rate?

We have 4 TB of data for indexing within 2 weeks.

How many concurrent queries/searches do you expect to need to support?

N/A

How large a portion of the data does each search/query typically target?

It can be 10k documents, retrieved with scrolling or another pagination method, to compare two sites.
In general, metrics will be calculated with aggregations rather than searches.

What are your latency requirements for queries/searches?

N/A
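The "4 TB within 2 weeks" figure given above can be turned into an average indexing rate with a small Python sketch. The function name is hypothetical and decimal units (1 TB = 10^12 bytes) are assumed. It yields only an average, since the thread gives no peak rate.

```python
def average_indexing_rate(total_tb, days, replicas=0):
    """Average write rate implied by a data volume and a time window.

    Hypothetical helper; assumes decimal units and a perfectly even load.
    With replicas > 0 the result includes the replica copies as well.
    """
    total_bytes = total_tb * 10**12 * (1 + replicas)
    gb_per_day = total_bytes / 10**9 / days
    mb_per_sec = total_bytes / 10**6 / (days * 86400)
    return gb_per_day, mb_per_sec


gb_per_day, mb_per_sec = average_indexing_rate(total_tb=4, days=14)
print(f"{gb_per_day:.1f} GB/day, {mb_per_sec:.2f} MB/s on average (primaries only)")
```

For 4 TB over 14 days this comes to roughly 285.7 GB per day, or about 3.31 MB/s of primary data. Passing `replicas=1` doubles both figures.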

Ibrahim_Can_Duran · September 4, 2023, 7:40am

Hi @Christian_Dahlqvist,

Do you have any comments on my reply? It would be great if you could help me with this.

system · October 2, 2023, 7:40am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
