Idealized server setup?


(Josh Harrison) #1

When designing a dedicated ES cluster, does it make more sense to have two
or three servers with a ton of resources each, or 5-10 cheaper commodity
hardware systems?
I know when an http request comes into a given machine, it can
automatically be routed to another. Does this routing happen over 9200 or
9300? I'm picturing having the cheap machines connected to both our
internal network for 9200, and to each other on a private 10.0.0.0 network
for 9300. Would that result in a performance boost?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/8bd2d826-549e-472b-b5bf-249a3e1c9767%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Nik Everett) #2

On Mon, Dec 9, 2013 at 2:47 PM, Josh Harrison hijakk@gmail.com wrote:

When designing a dedicated ES cluster, does it make more sense to have two
or three servers with a ton of resources each, or 5-10 cheaper commodity
hardware systems?

I'd go commodity to get more RAM. The sweet spot (I've heard) is 64 GB per
machine, with each one running a 30 GB heap.

I know when an http request comes into a given machine, it can
automatically be routed to another. Does this routing happen over 9200 or
9300?

9300 normally. If you run more than one ES instance on the same machine,
the second one will run on 9301, I believe.

I'm picturing having the cheap machines connected to both our internal
network for 9200, and to each other on a private 10.0.0.0 network for 9300.
Would that result in a performance boost?

I think there are too many variables here. So long as you have fast
switched connections between the machines you should be ok.

Elasticsearch currently doesn't account for nodes of significantly
different power: it won't balance shards based on machine power, though
there are things you can do to help somewhat.



(Mark Walkom) #3

No more than 64G (with a heap just under 32G) is best, as Java stops
compressing pointers once the heap reaches about 32G, which means you lose out.
We were running a cluster of 8 nodes with 512G per node and the resource
wastage was immense, as was the GC! Not to mention Java wouldn't even run
with a 256G heap size.
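
The sizing advice in this thread can be sketched as a small helper. This is only an illustration: the function name `recommended_heap_gb` is made up, the "half of RAM" rule is an assumption (a common rule of thumb, not something stated above), and the 30 GB cap is the figure Nik quoted, chosen to stay under the compressed-pointer limit.

```python
def recommended_heap_gb(ram_gb, heap_cap_gb=30):
    """Suggest a JVM heap size in GB for a machine with ram_gb of RAM.

    Assumptions (not from the thread): give the heap about half of RAM,
    leaving the rest to the OS file cache, and cap it at heap_cap_gb so
    the JVM can keep using compressed pointers.
    """
    if ram_gb <= 0:
        raise ValueError("ram_gb must be positive")
    return min(ram_gb // 2, heap_cap_gb)


if __name__ == "__main__":
    # Compare a small box, the 64 GB "sweet spot", and a 512 GB machine.
    for ram in (8, 64, 512):
        print(ram, "GB RAM ->", recommended_heap_gb(ram), "GB heap")
```

Under these assumptions a 512 GB machine gets the same heap as a 64 GB one, which is the resource wastage described above.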

The docs state that ES will listen on any IP (
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-network.html),
but that it will take the first one it finds. So you can't have it
listening on multiple addresses.

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com



(Jörg Prante) #4

What kind of performance are you after: maximum speed for executing a
single query, or maximum throughput of the overall system across all queries?

In general, ES is not designed for vertical scaling on a few big, powerful
machines. It is designed to scale out horizontally over lots of commodity
machines of the same type. Note that individual queries do not get faster
the more machines you add, but you do get higher overall throughput.
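
A toy model of that distinction, as a rough sketch only. The function name `cluster_model` and the numbers (50 ms per query, 4 concurrent queries per node) are invented for illustration, and the model ignores shard fan-out, replicas, and network overhead.

```python
def cluster_model(nodes, query_ms=50, concurrent_per_node=4):
    """Return (latency_ms, queries_per_second) for a simplified cluster.

    Assumption: every query takes query_ms on whichever node serves it,
    and each node can serve concurrent_per_node queries at a time.
    """
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    latency_ms = query_ms  # unchanged by adding nodes
    queries_per_second = nodes * concurrent_per_node * 1000 / query_ms
    return latency_ms, queries_per_second


if __name__ == "__main__":
    for n in (1, 5, 10):
        latency, qps = cluster_model(n)
        print(f"{n} nodes: {latency} ms per query, {qps:.0f} queries/s")
```

In this model the per-query latency stays flat while throughput grows linearly with the node count.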

Jörg



(Josh Harrison) #5

Great, thanks all. Better throughput is the goal. I'll have to see if I can
scrounge some decent systems up!



(Josh Harrison) #6

Hm, ok, so ES may not deal with substantially different capabilities of
machines in terms of speed, but if I throw in a bunch of older systems
with only a few GB of RAM and a few hundred GB of storage space, is ES
aware of the space constraints, distributing shards and replicas so that
they don't hit the storage capacity limit right away?
Thanks,
Josh



(Nik Everett) #7

On Tue, Dec 10, 2013 at 1:50 PM, Josh Harrison hijakk@gmail.com wrote:

Hm, ok, so ES may not deal with substantially different capabilities of
machines in terms of speed, but if I throw in a bunch of older systems
with only a few GB of RAM and a few hundred GB of storage space, is ES
aware of the space constraints, distributing shards and replicas so that
they don't hit the storage capacity limit right away?

Hard disk, yes:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-allocation.html#disk
RAM, I don't think so.
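
For reference, the disk-based allocation described at that link is controlled through cluster settings. The sketch below only builds the request body that would be sent to `PUT /_cluster/settings`; it makes no network call. The helper name `disk_allocation_settings` is made up and the watermark values are illustrative, not recommendations, so check the docs for your ES version before using them.

```python
import json


def disk_allocation_settings(low="85%", high="90%"):
    """Build a cluster-settings body enabling the disk allocation decider.

    low:  disk usage above which ES stops allocating new shards to a node.
    high: disk usage above which ES tries to move shards off a node.
    Both values here are placeholders chosen for illustration.
    """
    return {
        "persistent": {
            "cluster.routing.allocation.disk.threshold_enabled": True,
            "cluster.routing.allocation.disk.watermark.low": low,
            "cluster.routing.allocation.disk.watermark.high": high,
        }
    }


if __name__ == "__main__":
    # Print the JSON body; sending it to the cluster is left to the reader.
    print(json.dumps(disk_allocation_settings(), indent=2))
```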

Nik


