Hi,
I need to index ~1TB of data per day.
I have the required HW and want to know how to size the cluster: how many nodes, how many shards, etc.
Is there a formula for that?
Thanks.
Have you looked at the docs around optimizing for indexing speed? Have you ensured that you have very fast storage, ideally fast local SSDs, as this often ends up being the bottleneck?
What type of data are you ingesting? Do you have many different types of data or is it uniform?
What is the retention period for your data?
I have all the required HW: NVMe drives, CPU, and RAM.
At the moment I have one kind of data.
I want to understand what cluster (how many nodes, how many shards, etc.) I should set up in order to ingest ~1TB per day.
How long are you keeping the data in the cluster?
I have created a delete policy of 30 days and a rollover at 40GB per shard.
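For reference, a lifecycle policy matching those settings could be sketched roughly like this (the policy structure follows Elasticsearch ILM; the exact thresholds are the ones stated above, and everything else is a minimal illustration, not the poster's actual config):

```json
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_primary_shard_size": "40gb" }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": { "delete": {} }
      }
    }
  }
}
```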
I would recommend you have a look at some of the blog posts and webinars around capacity planning that should be available on the Elastic website. I found this one, which is about Elastic Cloud and quite old; most of the concepts should still be applicable, though.
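As a rough back-of-envelope, the figures in this thread (~1 TB/day ingest, 40 GB rollover, 30-day retention) already imply a shard count you can estimate directly. This is a sketch under stated assumptions (one replica, uniform daily volume), not a sizing formula:

```python
# Back-of-envelope shard math for the numbers in this thread.
# Assumptions (not from the thread): 1 replica, uniform daily volume.

daily_ingest_gb = 1000    # ~1 TB of raw data per day
rollover_size_gb = 40     # rollover threshold per shard
retention_days = 30       # delete phase after 30 days
replicas = 1              # replica copies per primary (assumption)

# Primary shards created per day by rollover alone
primary_shards_per_day = daily_ingest_gb / rollover_size_gb

# Total shards held in the cluster at steady state
total_shards = primary_shards_per_day * retention_days * (1 + replicas)

print(f"~{primary_shards_per_day:.0f} primary shards/day, "
      f"~{total_shards:.0f} shards at steady state")
```

Numbers like these are why retention and rollover size matter as much as raw hardware when planning node counts: the steady-state shard total has to stay within what the data nodes can comfortably hold.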
This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.