Memory/CPU ratio to disk size

Could I get some comments or insights on the following resource (CPU, memory, and disk size) configuration for one of my Elasticsearch clusters?

Data volume:

  • throughput: 18K docs/second (very continuous load)
  • size: 720 GB per day

Index settings:

  • replicas: 1
  • shards: 18
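These settings could be applied through an index template along these lines (a sketch; the template name and index pattern are placeholders, not from the thread):

```json
PUT _template/logs-template
{
  "index_patterns": ["logs-*"],
  "settings": {
    "index.number_of_shards": 18,
    "index.number_of_replicas": 1
  }
}
```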

Node configurations:

3 coordinating nodes:

  • for each node: 8 CPUs, 32 GB memory, 16 GB Java heap

3 master nodes:

  • for each node: 1 CPU, 8 GB memory, 4 GB Java heap, 50 GB SSD disk

6 data nodes:

  • for each node: 20 CPUs, 100 GB memory, 32 GB Java heap, 10 TB data disks

Thanks!

I have a few questions:

  • How large and complex are your documents?

  • What is your retention period?

  • How will you query the data? How frequently? What are the query latency requirements?

  • Are you using the latest version?

  • What type of storage will your data nodes use?

Thanks!

1. Each document is a log line, such as log4j or nginx logs. Each document is between 1,000 and 2,000 bytes.
2. I have set the retention period to 30 days.
3. We use Kibana to query the logs. The query latency requirements are not strict: under 1 minute for a complicated query and a few seconds for normal queries. I also have a job that periodically queries the latest doc to calculate the lag between the timestamp in the doc and the time it was indexed, and that calls some _cat APIs every 30 seconds to get the current state of the cluster.
4. We are on version 7.1. By the way, I assume upgrading from 7.1 to a later 7.x release should not be as hard as upgrading from 6.x to 7.x, right?
5. We use SSD-backed EBS volumes (e.g. io1 on AWS).
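The 30-day retention in (2) could be enforced with an ILM policy along these lines (a sketch; the policy name and the hot-phase rollover thresholds are assumptions, not from the thread):

```json
PUT _ilm/policy/logs-30d
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_size": "50gb", "max_age": "1d" }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": { "delete": {} }
      }
    }
  }
}
```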

@Christian_Dahlqvist could you help give some insights when you have time. Thanks

I would recommend watching the following videos:

If we make the simplified assumption that your data will take up the same size on disk as the raw size, and that you will have a replica for high availability, you will generate 1.44 TB of indices per day. Over the 30-day retention period that works out to around 7 TB of data per node. As the nodes will be handling a lot of indexing as well as querying, I would not be surprised to see some heap pressure before you reach that volume. I would therefore suspect you may need a larger cluster in terms of data nodes, but the only way to know for sure is to test.
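The arithmetic above can be sketched as a back-of-the-envelope check (assuming on-disk size equals raw ingest size, as stated):

```python
# Back-of-the-envelope disk sizing, assuming on-disk size == raw ingest size.
daily_raw_gb = 720                 # raw ingest per day
replicas = 1                       # one replica for high availability
retention_days = 30                # retention period from the thread
data_nodes = 6

daily_indexed_gb = daily_raw_gb * (1 + replicas)   # 1440 GB ~= 1.44 TB/day
total_gb = daily_indexed_gb * retention_days       # 43200 GB ~= 43.2 TB
per_node_gb = total_gb / data_nodes                # 7200 GB ~= 7.2 TB per node

print(f"{per_node_gb / 1000:.1f} TB per data node")  # 7.2 TB per data node
```

Note this leaves little headroom against the 10 TB of disk per node once disk watermarks and merge overhead are considered.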

Thanks so much for the guidance.

Also make sure you read this blog post about sharding practices.
