Calculate optimal number of nodes


(jeangld) #1

Hi,

I'm new to elasticsearch and trying to understand the basics. I'm trying
to calculate the correct number of nodes for a specific index size.
Let's start with an example:

The index size is 100 GB, number of replicas=2, number of shards does
not matter (I guess), so for optimal performance, there should be
100GB + 2x100GB = 300GB in memory.

If my servers have 32GB of RAM, I would need 10 of those: roughly 30GB
for elasticsearch and 2GB for the operating system on each. Is this correct?

Half a year later, the index grows to 200GB, my options are either
a) to add another 10 servers with 32GB of ram or
b) replace the old 32GB-ram servers with 10 new 64GB ram servers.

From what I've read so far, I can't mix servers with different speeds
and different memory sizes; the slowest one is always the bottleneck.

Thanks,

Jean


(Shay Banon) #2

The full index is not stored in memory, so you don't need to size memory based on the size of the index. Memory usage is mainly driven by the term index loaded into memory (a sort of skip list that makes searches faster; by default every 128th term is loaded), and by sorting / faceting on fields (that part you can check using the node stats API).
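For example, the per-node memory figures come back as JSON from the node stats API, and can be pulled out like this (a minimal sketch — the payload below is a made-up sample, and the exact endpoint and field names vary by elasticsearch version, so check the node stats output of your own version):

```python
import json

# Illustrative node-stats-style payload; the real layout and field names
# depend on the elasticsearch version, so treat this as a placeholder.
sample = json.loads("""
{
  "nodes": {
    "node1": {
      "indices": {
        "cache": {
          "field_size_in_bytes": 1258291200
        }
      }
    }
  }
}
""")

for node_id, stats in sample["nodes"].items():
    field_bytes = stats["indices"]["cache"]["field_size_in_bytes"]
    print(node_id, round(field_bytes / 1024**3, 2), "GB used by the field cache")
```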

On Tuesday, February 7, 2012 at 12:21 PM, jeangld@yahoo.com wrote:



(jeangld) #3

Thanks for the fast reply. Which number should I look at in the node
stats? I always thought it was best to have the full index (except the
data file, *.fdt) in memory.

