I am working on an Elasticsearch cluster with the following requirements:
2000 TPS
10 KB payload
3 months retention without summarising
1 year of summarised data.
I have tried the same setup with 200 TPS instead of 2000 TPS on a 3-node cluster of 16-core, 64 GB machines, but it could not even load a simple dashboard over one month's worth of data. I have searched the internet but could not find a proper document on how to design an Elasticsearch cluster for such a large data set.
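As a side note on the retention requirements above (3 months of raw data, summarised data kept for 1 year), this kind of scheme is typically expressed in Elasticsearch with an index lifecycle management (ILM) policy, optionally combined with downsampling for time-series data streams. The snippet below is only a minimal sketch: the policy name, rollover thresholds, downsampling interval, endpoint, and credentials are placeholders, and the downsample action assumes a time-series data stream on a recent 8.x release.

```python
# Minimal ILM sketch for the stated retention scheme: keep raw data for
# ~3 months, downsample it after that, and delete everything after ~1 year.
# Policy name, rollover sizes, interval, URL, and credentials are placeholders.
import requests

ES_URL = "http://localhost:9200"     # placeholder endpoint
AUTH = ("elastic", "changeme")       # placeholder credentials

policy = {
    "policy": {
        "phases": {
            "hot": {
                "actions": {
                    "rollover": {
                        "max_primary_shard_size": "50gb",  # keep shards a manageable size
                        "max_age": "1d"
                    }
                }
            },
            "warm": {
                "min_age": "90d",                           # ~3 months of raw data
                "actions": {
                    "downsample": {"fixed_interval": "1h"}  # summarise older data
                }
            },
            "delete": {
                "min_age": "365d",                          # drop everything after ~1 year
                "actions": {"delete": {}}
            }
        }
    }
}

resp = requests.put(f"{ES_URL}/_ilm/policy/metrics-retention-sketch",
                    json=policy, auth=AUTH)
resp.raise_for_status()
print(resp.json())
```

Note that ILM phase ages are measured from rollover, so the exact cut-over points would need tuning against the rollover settings you actually use.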
If we assume this is the average ingest rate, that means you are looking to ingest around 173 million documents per day, roughly 1.6 TB per day given that these documents are quite large. If we make the simplifying assumption that the data takes up the same amount of space on disk once indexed into primary shards, and then add a replica for resiliency, we end up with around 96 TB of data on disk just for the first 30 days.
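For reference, here is the back-of-the-envelope arithmetic behind those numbers as a small Python sketch. The 2000 docs/s rate and 10 KB document size come from the requirements above; the 1:1 indexed-to-raw size ratio and single replica are the simplifying assumptions just mentioned.

```python
# Back-of-the-envelope sizing for the stated workload.
# Assumptions: sustained average of 2000 docs/s, 10 KB per document,
# indexed size == raw size (1:1 ratio), and one replica copy.

DOCS_PER_SECOND = 2000
DOC_SIZE_BYTES = 10 * 1024          # 10 KB payload
REPLICAS = 1                        # one replica for resiliency
SECONDS_PER_DAY = 24 * 60 * 60

docs_per_day = DOCS_PER_SECOND * SECONDS_PER_DAY
raw_tb_per_day = docs_per_day * DOC_SIZE_BYTES / 1024**4    # TB (TiB) per day
disk_tb_30_days = raw_tb_per_day * (1 + REPLICAS) * 30      # primaries + replicas

print(f"documents per day : {docs_per_day:,}")              # ~173 million
print(f"raw data per day  : {raw_tb_per_day:.2f} TB")       # ~1.6 TB
print(f"disk after 30 days: {disk_tb_30_days:.1f} TB")      # ~96-97 TB
```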
Exactly how much data a single node can handle depends on a number of factors, e.g. heap usage, query latency requirements, and hardware profile. This can be hard to estimate accurately and often requires testing, as it depends heavily on the data and the use case. It is, however, very likely that you are looking at a cluster significantly larger than 3 nodes.
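To illustrate why 3 nodes is unlikely to be enough, here is a rough node-count estimate extending the sketch above. The 90-day raw retention comes from the requirements; the usable disk per data node and the fill-ratio headroom are purely illustrative assumptions that you would replace with your own hardware profile.

```python
# Rough data-node count estimate for the 90-day raw-retention window.
# Assumptions (illustrative only): ~1.6 TB of primary data per day,
# one replica, 90 days of unsummarised data kept online, and data nodes
# with ~8 TB of usable disk each, filled to at most ~75% to leave
# headroom below the disk watermarks.
import math

RAW_TB_PER_DAY = 1.6          # from the calculation above
REPLICAS = 1
RETENTION_DAYS = 90           # 3 months without summarising
USABLE_DISK_TB_PER_NODE = 8   # hypothetical hardware profile
MAX_FILL_RATIO = 0.75         # stay well under the disk watermarks

total_disk_tb = RAW_TB_PER_DAY * (1 + REPLICAS) * RETENTION_DAYS
nodes_needed = math.ceil(total_disk_tb / (USABLE_DISK_TB_PER_NODE * MAX_FILL_RATIO))

print(f"data on disk for 90 days: {total_disk_tb:.0f} TB")   # ~288 TB
print(f"data nodes needed       : {nodes_needed}")           # ~48 with these assumptions
```

Even with denser nodes or a hot/warm architecture where older data sits on cheaper storage, the disk arithmetic alone points to far more than 3 nodes, before indexing throughput and query load are even considered.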
I would recommend looking at the following resources:
Regarding the other factors you have mentioned: this system is mostly used by internal teams, so latency within 30 s is acceptable and only a few (fewer than 10) concurrent users are expected. I can request the hardware resources required. Would it be possible for you to give me a rough cluster architecture for the above requirements, which I can then tune to get the maximum performance?
Thank you for the links. I will go through those docs and get back to you.