Hi ES Experts,
I am new to Elasticsearch and have the following use case; please point me in the right direction.
Daily data volume: 400+ TB.
How many nodes, clusters, indices, shards, and replicas do I need? Additionally, our servers will run Linux, and it would be great if you could advise us on the right hardware to hold and process the data we receive: CPU, cores, RAM, and SSD vs. HDD.
I hope my use case is clear. Can you please give me some suggestions?
I look forward to hearing from you in the near future.
400+ TB of data per day is a lot, so it is likely to involve a significant amount of hardware. You have also not described what type of data you have, how long you need to keep it, or how you are going to search/analyze it, which are all important factors.
Thanks for getting back to me Chris.
Back to your questions:
- Our data is structured/unstructured text in JSON files.
- Search and analysis means full-text search plus aggregations.
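For illustration, "search plus aggregations" in Elasticsearch usually means a single `_search` request that filters documents and computes summaries in one pass. A minimal sketch of such a request body follows; the index fields (`service`, `response_ms`, `@timestamp`) and the service name are hypothetical placeholders, not anything from this thread:

```python
# Sketch of a combined search + aggregation request body, as it would be
# sent to Elasticsearch's _search API. Field names ("service",
# "response_ms") are hypothetical placeholders.

def build_search_request(service_name: str) -> dict:
    """Build a request that filters on one field and aggregates a metric."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"service": service_name}}  # exact-match filter
                ]
            }
        },
        "aggs": {
            # average of a numeric field across all matching documents
            "avg_latency": {"avg": {"field": "response_ms"}},
            # bucket matching documents per hour
            "by_hour": {
                "date_histogram": {
                    "field": "@timestamp",
                    "calendar_interval": "hour",
                }
            },
        },
        "size": 0,  # aggregations only; skip returning individual hits
    }

request = build_search_request("checkout")
print(request["aggs"]["avg_latency"])  # {'avg': {'field': 'response_ms'}}
```

At 400 TB/day, the shape of these aggregations (cardinality, time ranges queried) matters as much as raw volume for sizing.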
How long do you need to keep data in the cluster?
Is the raw data format JSON? What is the average event size?
How many concurrent users are expected? Will you be using Kibana?
Thanks, can you explain what you mean by "how long do you keep data in the cluster"?
The format has not been built yet.
We will use Kibana, but the number of users is not yet decided.
I am asking about retention period.
Sorry for the late reply. We plan to keep the data for one year.
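As an aside, a fixed retention period like one year is normally enforced in Elasticsearch with an index lifecycle management (ILM) policy rather than manual deletes. A minimal sketch of such a policy body follows; the rollover thresholds are illustrative assumptions, not recommendations from this thread:

```python
# Sketch of an ILM policy body enforcing roughly one year of retention.
# The rollover thresholds (50gb / 1d) are illustrative assumptions; only
# the 365-day retention comes from the thread.
ilm_policy = {
    "policy": {
        "phases": {
            "hot": {
                "actions": {
                    # roll to a new backing index when either limit is hit
                    "rollover": {
                        "max_primary_shard_size": "50gb",
                        "max_age": "1d",
                    }
                }
            },
            "delete": {
                # delete indices once they are a year past rollover
                "min_age": "365d",
                "actions": {"delete": {}},
            },
        }
    }
}
```

At this scale you would likely also want warm/cold/frozen phases to move older indices onto cheaper storage before deletion.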
If you ingest 400 TB a day and keep it for a year, that is roughly 146 PB of raw data. I doubt you will find anyone here who can give you an estimate for a use case of that size, so you probably need to roll up your sleeves and do some benchmarking...
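The back-of-the-envelope arithmetic behind that number can be extended to an on-disk footprint; note that the replica count and index-to-raw size ratio below are assumptions that would need to be confirmed by benchmarking, not figures from this thread:

```python
# Back-of-the-envelope cluster sizing. Only the 400 TB/day ingest rate and
# 365-day retention come from the thread; the replica count and
# index-to-raw ratio are assumptions to replace with benchmark results.
DAILY_INGEST_TB = 400
RETENTION_DAYS = 365
REPLICAS = 1              # assumption: one replica per primary shard
INDEX_TO_RAW_RATIO = 1.1  # assumption: indexed size vs. raw JSON size

raw_pb = DAILY_INGEST_TB * RETENTION_DAYS / 1000          # 146.0 PB raw
on_disk_pb = raw_pb * (1 + REPLICAS) * INDEX_TO_RAW_RATIO  # with replicas
print(f"raw: {raw_pb:.0f} PB, on disk: {on_disk_pb:.0f} PB")
```

With even one replica, the on-disk requirement more than doubles the raw figure, which is why the retention period and replication strategy dominate any hardware estimate here.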
This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.