Multiple nodes on same machine

Mingfeng_Yang · March 19, 2013, 11:35pm

I plan to use Elasticsearch as a documentation retrieval engine which will
serve hundreds of millions of documents, but the query rate will be low.
The ES cluster will probably receive only a few queries each hour.

We are planning to use EC2 m2.2xlarge instances, each with 32G of memory and
4 CPU cores, so I would like to run 4 ES nodes on each EC2 instance to
maximize CPU utilization. In this case, is it beneficial to run multiple
nodes on the same machine?

My own experience with Solr is that it does help to use resources more
efficiently.

Regards,
Ming

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

jprante · March 20, 2013, 12:18am

No, it is not beneficial.

Here are the reasons:

a) if you start many JVMs, you create JVM-induced overhead. That is, the
JVMs compete for the resources the OS provides (CPU, network, memory).
Because the OS must decide which JVM gets which resources, it spends
more time and space on those decisions, and this is not negligible. The
more JVMs you run in parallel, the higher the risk of overall system
degradation, and in many cases the risk of paging (swapping) rises too.

b) the ES code is optimized for scalability. What does that mean? You
can increase the parameters for CPU (threads), memory (heap) and network
(netty pools) for the ES JVM, and this increases its overall capacity as
far as your machine can handle. There is no reason not to dedicate a
whole machine to a single ES node.

c) a single ES JVM can manage hundreds or thousands of Lucene indexes at
once. This is done by index sharding and automatic workload
distribution. Each node can hold many indices with many index shards. An
ES node does not restrict you to a model of a single index with a single
shard.
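
As a rough illustration of point c), a single index can be created with several shards up front, and one node will host all of them until more nodes join. The sketch below only builds the JSON body for Elasticsearch's create-index request; the index name `docs` and the shard and replica counts are made-up values, not recommendations from this thread.

```python
import json


def create_index_body(number_of_shards, number_of_replicas):
    """Build the settings body for a create-index request (PUT /<index>)."""
    if number_of_shards < 1:
        raise ValueError("an index needs at least one primary shard")
    if number_of_replicas < 0:
        raise ValueError("replica count cannot be negative")
    return {
        "settings": {
            "number_of_shards": number_of_shards,
            "number_of_replicas": number_of_replicas,
        }
    }


# Hypothetical values: 8 primary shards, no replicas while on one node.
body = create_index_body(8, 0)
print("PUT /docs")
print(json.dumps(body, indent=2))
```

With a single node, all eight shards live in that one JVM; when further nodes join the cluster, Elasticsearch can relocate shards onto them.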

Jörg

Mingfeng_Yang · March 20, 2013, 1:08am

Jorg,

Thanks for the info, very useful. So basically I can run one ES instance
which holds multiple shards, and once the shards get big, I can migrate
them to separate machines?

Thanks,
Ming

Andy_Wick · March 20, 2013, 12:43pm

Totally agree for 32G machines, but as memory gets cheaper and cheaper I'm
curious whether anyone has actually benchmarked or stress-tested single vs.
multiple nodes on large-memory machines.

We actually run 2 nodes (22G each) on our 64G machines.

a) so we can have -XX:+UseCompressedOops
b) with the theory (untested) that GC pauses will be faster/less often/...
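
The arithmetic behind that layout can be sketched as follows. HotSpot uses compressed object pointers only while the heap stays below roughly 32 GB, so the check uses an assumed conservative limit of 31 GB; the helper name and the limit are illustrative, not something measured in this thread.

```python
COMPRESSED_OOPS_LIMIT_GB = 31  # assumed safe bound; the real cutoff is just under 32 GB


def heap_plan(machine_ram_gb, nodes, heap_per_node_gb):
    """Summarize a nodes-per-machine heap layout."""
    total_heap = nodes * heap_per_node_gb
    if total_heap > machine_ram_gb:
        raise ValueError("heaps do not fit in physical memory")
    return {
        "total_heap_gb": total_heap,
        # Memory not claimed by JVM heaps stays available to the OS,
        # including the filesystem cache that Lucene relies on.
        "left_for_os_gb": machine_ram_gb - total_heap,
        "compressed_oops": heap_per_node_gb <= COMPRESSED_OOPS_LIMIT_GB,
    }


# Layout from the post: 2 nodes with 22G heaps on a 64G machine.
print(heap_plan(64, 2, 22))
# One hypothetical 44G node on the same machine loses compressed oops.
print(heap_plan(64, 1, 44))
```

This only captures the memory side of the trade-off; whether GC pauses actually improve with two smaller heaps would still need a benchmark.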

vineeth_mohan · March 20, 2013, 1:19pm

I have heard that multiple cores are best utilized by separate processes
rather than separate threads.
If that is true and the machine has many cores, won't multiple instances
be a good idea?

Thanks
Vineeth
