Maximum number of disks/storage size per data node

I was wondering what the maximum recommended disk size/number of disks is per data node. What would be the consequences of using big data nodes (24 TB+) in a heavy indexing cluster?

As described in this blog post, each shard comes with some overhead in terms of heap usage. Exactly how much depends on what type of data you are storing as well as the size of the shards. You also need heap space for indexing and querying data.
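To get a feel for how many shards and how much data each node currently holds, and the heap pressure that goes with it, the _cat APIs are a quick illustrative check (not a sizing formula):

```
GET _cat/allocation?v                                     # shards and disk usage per data node
GET _cat/nodes?v&h=name,node.role,heap.percent,heap.max   # heap pressure per node
```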

Indexing is very I/O-intensive and can use a lot of heap, so in order to optimise storage on the nodes it is common to implement a hot/warm architecture. This means that a subset of nodes in the cluster (the hot tier), equipped with fast SSD storage and a good amount of CPU, handles all indexing of new data as well as querying of the most recently indexed data. These nodes hold relatively little data, as a lot of their heap is used for indexing and querying.
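As a rough sketch of how the hot tier is usually tagged (the attribute name `box_type` and the index name below are illustrative choices, not required names):

```
# elasticsearch.yml on each hot node
node.attr.box_type: hot
```

```
# New daily indices are created with allocation pinned to the hot tier
PUT logs-2018.06.01
{
  "settings": {
    "index.routing.allocation.require.box_type": "hot"
  }
}
```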

A separate set of nodes (the warm tier) holds indices that are more than a few days old. These are typically not indexed into, which means heap can be dedicated to querying and shard overhead. These nodes typically have large volumes of spinning disks and can hold much more data than the hot nodes. This is where 'big data nodes' are more suitable.
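Continuing the same sketch, warm nodes get their own attribute value, and older indices are relocated by updating the allocation requirement (again, names are just examples):

```
# elasticsearch.yml on each warm node
node.attr.box_type: warm
```

```
# Once an index is a few days old and no longer indexed into,
# update its allocation requirement and it will relocate to the warm tier
PUT logs-2018.06.01/_settings
{
  "index.routing.allocation.require.box_type": "warm"
}
```

Newer versions can automate this rollover with index lifecycle management (ILM), but the manual allocation settings above show the underlying mechanism.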

Exactly how much data you can put on a node will depend on what type of data you have, how effectively you can minimise per-shard overhead, and how much heap you need to set aside for querying.

