Could I get some comments, concerns, or insights on the following resource (CPU, memory, and disk size) configuration for one of my Elasticsearch clusters?
Data volume:
throughput: 18K docs/second (a very continuous load)
size: 720 GB per day
Index settings (see the template sketch below):
replicas: 1
shards: 18
Node configuration:
3 coordinating nodes:
for each node: 8 CPUs, 32 GB memory, 16 GB Java heap
3 master nodes:
for each node: 1 CPU, 8 GB memory, 4 GB Java heap, 50 GB SSD disk
6 data nodes:
for each node: 20 CPUs, 100 GB memory, 32 GB Java heap, 10 TB data disks
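For context, here is a minimal sketch of how the shard/replica settings above could be applied through a legacy index template on 7.1 (the template name, index pattern, and host are placeholder assumptions, not my actual values):

```python
# Minimal sketch: apply the shard/replica settings via a legacy index template.
# (The composable _index_template API only arrived in 7.8, so 7.1 uses _template.)
# The template name, index pattern, and host are illustrative assumptions.
import requests

ES = "http://localhost:9200"  # assumed coordinating-node endpoint

template = {
    "index_patterns": ["logs-*"],   # assumed daily-index naming scheme
    "settings": {
        "number_of_shards": 18,     # 18 primary shards per index
        "number_of_replicas": 1,    # 1 replica for high availability
    },
}

resp = requests.put(f"{ES}/_template/logs-template", json=template)
resp.raise_for_status()
```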
1. Each document is a log line, e.g. log4j or nginx logs. The size of each document is between 1000 and 2000 bytes.
2. I am setting the retention period to 30 days (see the ILM sketch after this list).
3. We are using Kibana to query the logs. The query latency requirement is not strict: less than one minute for a complicated query, and a few seconds for normal queries. Besides that, I have a job that periodically queries the latest document to calculate the lag between the timestamp in the document and the time it was indexed, and that calls some _cat APIs every 30 seconds to get the current state of the cluster (see the monitoring sketch after this list).
4. We are using version 7.1. By the way, I think upgrading from 7.1 to a later 7.x release should not be as hard as upgrading from 6.x to 7.x, right?
5. We are using SSD-backed EBS volumes (e.g. io1 on AWS).
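For the 30-day retention in point 2, a delete-phase ILM policy is one option on 7.1. A minimal sketch, with a placeholder policy name and host:

```python
# Minimal sketch, assuming ILM on Elasticsearch 7.1: delete indices 30 days
# after they are created/rolled over. Policy name and host are placeholders.
import requests

ES = "http://localhost:9200"  # assumed coordinating-node endpoint

policy = {
    "policy": {
        "phases": {
            "delete": {
                "min_age": "30d",          # 30-day retention from point 2
                "actions": {"delete": {}}  # drop the index once it ages out
            }
        }
    }
}

resp = requests.put(f"{ES}/_ilm/policy/logs-30d-retention", json=policy)
resp.raise_for_status()
```

The policy would still need to be attached to the indices, e.g. via index.lifecycle.name in the index template.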
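The monitoring job in point 3 does roughly the following; this is a simplified sketch, and the index pattern, timestamp field, and host are placeholders rather than my real names:

```python
# Rough sketch of the periodic monitoring job from point 3 (names are assumed):
# 1) fetch the newest document and compute indexing lag from its timestamp,
# 2) poll a _cat API every 30 seconds for a view of cluster state.
import time
from datetime import datetime, timezone

import requests

ES = "http://localhost:9200"   # assumed coordinating-node endpoint
INDEX = "logs-*"               # assumed index pattern
TS_FIELD = "@timestamp"        # assumed timestamp field in the documents

def indexing_lag_seconds():
    """Return now minus the timestamp of the most recently indexed document."""
    query = {
        "size": 1,
        "sort": [{TS_FIELD: {"order": "desc"}}],
        "_source": [TS_FIELD],
    }
    resp = requests.post(f"{ES}/{INDEX}/_search", json=query)
    resp.raise_for_status()
    ts = resp.json()["hits"]["hits"][0]["_source"][TS_FIELD]
    # assumes an ISO-8601 timestamp ending in "Z"
    doc_time = datetime.fromisoformat(ts.replace("Z", "+00:00"))
    return (datetime.now(timezone.utc) - doc_time).total_seconds()

def cluster_state():
    """Fetch a compact view of node load via the _cat API."""
    resp = requests.get(f"{ES}/_cat/nodes?v&h=name,heap.percent,cpu,load_1m")
    resp.raise_for_status()
    return resp.text

while True:
    print(f"indexing lag: {indexing_lag_seconds():.1f}s")
    print(cluster_state())
    time.sleep(30)             # the 30-second polling interval
```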
If we make the simplified assumption that your data will take up the same size on disk as the raw size, and that you will have a replica for high availability, you will generate 1.44 TB of indices per day. Over the 30-day retention that is roughly 43 TB, which works out to around 7 TB of data per data node. As the nodes will be handling a lot of indexing as well as querying, I would not be surprised to see some heap pressure before you reach that volume. I would therefore suspect you might need a larger cluster in terms of data nodes, but the only way to know for sure is to test.
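The back-of-the-envelope arithmetic behind those figures, as a small sketch (the 1:1 raw-to-disk ratio is the simplifying assumption stated above, not a measured number):

```python
# Back-of-the-envelope sizing from the reply above; the 1:1 raw-to-disk ratio
# is the stated simplifying assumption, not a measured figure.
raw_per_day_tb = 0.72       # 720 GB of raw logs per day
replicas = 1                # one replica copy of each shard
retention_days = 30
data_nodes = 6
disk_per_node_tb = 10

daily_on_disk = raw_per_day_tb * (1 + replicas)    # 1.44 TB/day
total_on_disk = daily_on_disk * retention_days     # ~43.2 TB
per_node = total_on_disk / data_nodes              # ~7.2 TB per data node

print(f"per data node: {per_node:.1f} TB of {disk_per_node_tb} TB disk "
      f"({per_node / disk_per_node_tb:.0%} full)")
```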