Elasticsearch Capacity Planning Help Required

nuwancs · October 27, 2019, 3:28am

I am working on an Elasticsearch cluster that has to meet the following requirements:

2000 TPS
10 KB payload
3 months retention period without summarising
1 year summarised data.

I tried the same workload at 200 TPS instead of 2000 TPS on a 3-node cluster of 16-core, 64 GB machines, but it could not even load a simple dashboard over one month's worth of data. I searched the internet but couldn't find a proper guide for sizing an Elasticsearch cluster for such a large data set.

Can someone help me with this?

Christian_Dahlqvist · October 27, 2019, 8:48am

If we assume this is the average ingest rate, you are looking to ingest about 173 million documents per day with a total size of around 1.6TB, as the documents are quite large. If we make the simplified assumption that the data takes up the same amount of space when indexed into primary shards, and then add a replica for resiliency, we end up with 96TB of data on disk just for the first 30 days.
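The arithmetic above can be reproduced with a short Python sketch. It is only a back-of-the-envelope estimate under the same simplifying assumptions: 1 KB is taken as 1024 bytes, indexed size is assumed equal to raw size, and one replica is assumed. The function and constant names are made up for illustration.

```python
# Back-of-the-envelope disk estimate from the figures in this thread.
# Assumptions (not measurements): 1 KB = 1024 bytes, indexed size equals
# raw size, and each primary shard has one replica.

SECONDS_PER_DAY = 86_400
KIB_PER_TIB = 1024 ** 3


def docs_per_day(tps: int) -> int:
    """Documents ingested per day at a constant average rate."""
    return tps * SECONDS_PER_DAY


def raw_tib_per_day(tps: int, payload_kb: float) -> float:
    """Raw data volume per day in TiB, before replication."""
    return docs_per_day(tps) * payload_kb / KIB_PER_TIB


def disk_tib(tps: int, payload_kb: float, days: int, replicas: int = 1) -> float:
    """Disk needed for `days` of data, counting primaries plus replicas."""
    return raw_tib_per_day(tps, payload_kb) * (1 + replicas) * days


if __name__ == "__main__":
    tps, payload_kb = 2000, 10
    print(f"docs/day:       {docs_per_day(tps):,}")
    print(f"raw TiB/day:    {raw_tib_per_day(tps, payload_kb):.2f}")
    print(f"30 days on disk: {disk_tib(tps, payload_kb, 30):.0f} TiB")
    print(f"90 days on disk: {disk_tib(tps, payload_kb, 90):.0f} TiB")
```

Changing `days` to 90 gives a rough figure for the full 3-month unsummarised retention. Real index size can be noticeably larger or smaller than the raw size depending on mappings and compression, so treat the result as a starting point for testing.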

Exactly how much data a single node can handle depends on a number of factors, e.g. heap usage, query latency requirements, and hardware profile. This can be hard to estimate accurately and often requires some testing, as it depends heavily on the data and the use case. It is, however, very likely that you are looking at a cluster significantly larger than 3 nodes.

I would recommend looking at the following resources:

https://www.elastic.co/webinars/using-rally-to-get-your-elasticsearch-cluster-size-right

nuwancs · October 27, 2019, 2:30pm

Thank you @Christian_Dahlqvist for the quick response.

Regarding the other factors you mentioned:
This system is mostly used by internal teams, so latency of up to 30s is acceptable and only a few (fewer than 10) concurrent users are expected. I can request whatever hardware resources are required. Could you give me a rough cluster architecture for the above requirements, so that I can then tune the settings for maximum performance?

Thank you for the links. I will go through those docs and get back to you.

