Do more memory or more CPU cores give better performance?


(Xudong You) #1

Hi,
I am running ES on cloud virtual machines. The cloud platform offers
several VM tiers to choose from, for example 4 CPU cores with 28 GB of
memory, or 8 CPU cores with 14 GB of memory, and each tier has a different
cost. To keep costs down, I want to choose the VM that stays within our
budget and gives the best query performance.
So, from a query performance point of view, should I choose a VM with more
CPU cores or one with more memory? Does anyone have experience with the
best combination of CPU and memory for ES performance?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/e449f0bb-5c92-4aee-84f5-285171e8070c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Mark Walkom) #2

Depends - you will want to do some tests to see what sort of resources your
use case requires.
Start with smaller machines and go from there.



(Ishafizan Ishak) #3

The question is subjective. The good thing about ES is that you can scale
out and add servers as needed. Query performance also depends on your index
settings, mappings, and replicas.
I have a cluster at DigitalOcean (https://www.digitalocean.com/pricing/):

Number of nodes: 10

  • master: 3 (1 GB, 1 core)
  • client: 5 (512 MB, 1 core)
  • data: 3 (8 GB, 4 cores)

Total shards: 82
~160M docs



(Jörg Prante) #4

Can you specify what kind of performance you mean?

  • minimal response time for a single query
  • maximum throughput for all queries

For maximum performance, all kinds of virtual machines are a bad choice
compared to physical machines in your own data center.

Jörg
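
To make the two kinds of performance above concrete, here is a minimal Python sketch that reports both for the same load run. Nothing in it talks to Elasticsearch: `fake_query` is a hypothetical stand-in that just sleeps, and `run_load` is an illustrative helper name, so treat it as a shape for your own test rather than a benchmark tool.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def run_load(query_fn, total_requests, concurrency):
    """Call query_fn total_requests times using `concurrency` threads.

    Returns (median_latency_seconds, throughput_requests_per_second).
    """
    def timed_call(_):
        start = time.perf_counter()
        query_fn()
        return time.perf_counter() - start

    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(timed_call, range(total_requests)))
    wall_elapsed = time.perf_counter() - wall_start

    return statistics.median(latencies), total_requests / wall_elapsed


def fake_query():
    # Hypothetical stand-in for a real search request; it only sleeps 10 ms.
    time.sleep(0.01)


if __name__ == "__main__":
    for workers in (1, 4):
        latency, throughput = run_load(fake_query, total_requests=40,
                                       concurrency=workers)
        print(f"{workers} workers: median {latency * 1000:.1f} ms, "
              f"{throughput:.0f} req/s")
```

With a real query function in place of `fake_query`, raising `concurrency` typically increases throughput until some resource saturates, while the latency of a single query stays flat or gets worse. That is why the two goals can point to different machines.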



(Xudong You) #5

I want the best maximum throughput across all queries.
As for VMs vs. physical machines: I agree that physical machines beat VMs,
but our strategy is to move our platform to the cloud, so VMs are the only
choice.



(Jörg Prante) #6

First you need to find out whether your workload is CPU-bound or
network-bound.

If it is CPU-bound, go for the virtual machine with the best CPU resources.

If it is network-bound, go for the virtual machine that offers the best
network connectivity.

It is very hard to get precise performance numbers on public virtual
machines because many other tenants are using the same resources at the
same time in unpredictable ways.

Jörg
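
One rough way to apply the CPU-bound vs. network-bound check is to record utilisation samples while the cluster is under load and see which resource sits near its limit. The Python sketch below assumes you already have such samples as fractions between 0 and 1 from whatever monitoring you use. The function name and the 0.8 threshold are illustrative assumptions, not Elasticsearch recommendations, and on a shared VM the samples will be noisy.

```python
def classify_bottleneck(cpu_util, net_util, threshold=0.8):
    """Label a load test as CPU-bound, network-bound, both, or neither.

    cpu_util and net_util are lists of utilisation samples in [0, 1]
    collected while the cluster is under load. The default threshold
    of 0.8 is an assumption chosen for illustration only.
    """
    if not cpu_util or not net_util:
        raise ValueError("need at least one sample of each kind")

    avg_cpu = sum(cpu_util) / len(cpu_util)
    avg_net = sum(net_util) / len(net_util)

    cpu_hot = avg_cpu >= threshold
    net_hot = avg_net >= threshold

    if cpu_hot and net_hot:
        return "both"
    if cpu_hot:
        return "cpu-bound"
    if net_hot:
        return "network-bound"
    return "neither"


if __name__ == "__main__":
    # Made-up samples: CPU is saturated, the network link is mostly idle.
    print(classify_bottleneck([0.92, 0.95, 0.90], [0.20, 0.25, 0.30]))
```

A "neither" result usually means the load generator is not pushing hard enough, or that something else (disk, memory pressure) is the limit.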



(Xudong You) #7

Thanks!
So in your experience, are Elasticsearch queries more CPU-bound or IO-bound?
Anyway, I will do more perf testing with real data on different VMs to find
out the best CPU and memory combination for my case.
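
Once each VM tier has been load-tested, picking the best one within budget is a small selection step. The Python sketch below assumes the results are recorded by hand. The two tiers are the ones from the original question, but the costs and throughput figures are made-up placeholders, and `VmResult` and `best_within_budget` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VmResult:
    name: str
    cores: int
    memory_gb: int
    monthly_cost: float     # placeholder price, not a real quote
    queries_per_sec: float  # throughput measured in your own load test


def best_within_budget(results: List[VmResult],
                       budget: float) -> Optional[VmResult]:
    """Return the affordable VM with the highest measured throughput."""
    affordable = [r for r in results if r.monthly_cost <= budget]
    if not affordable:
        return None
    return max(affordable, key=lambda r: r.queries_per_sec)


if __name__ == "__main__":
    # All numbers below are invented for illustration.
    results = [
        VmResult("4 cores / 28 GB", 4, 28, monthly_cost=300.0,
                 queries_per_sec=450.0),
        VmResult("8 cores / 14 GB", 8, 14, monthly_cost=320.0,
                 queries_per_sec=600.0),
    ]
    choice = best_within_budget(results, budget=350.0)
    print(choice.name if choice else "nothing fits the budget")
```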



(Jörg Prante) #8

As I said, it depends.

When bulk-indexing documents, for example, my multi-threaded workload is
network-bound. It can easily be made CPU-bound by pre-processing documents
in single-threaded mode. Certain queries are CPU-bound, others are not. If
I retrieve millions of documents in a row, decompression outweighs query
execution and result transport time. There are many knobs to turn, for
example caching or scan/scroll. Because of this, there is no fixed rule
for all situations.

Jörg



(Xudong You) #9

Thanks a lot Jörg!


