On Wednesday, April 29, 2015 at 10:17:04 AM UTC+8, Xudong You wrote:
Hi,
I am building an ES cluster on cloud virtual machines. The cloud platform
offers several VM tiers to choose from, for example 4 CPU cores with 28 GB of
memory, or 8 CPU cores with 14 GB of memory. Each tier has a different cost.
To keep costs down, I want to choose the VM that gives the best query
performance without exceeding our budget.
So, from a query performance point of view, should I choose a VM with more CPU
cores or one with more memory? Does anyone have experience with the best
combination of CPU and memory for ES performance?
The question is subjective. The good thing about ES is that you can scale out
and add servers as needed. Query performance also depends on your index
settings, mappings, and replicas.
I have a cluster running on DigitalOcean:
no. of nodes: 10
master: 3 (1 GB, 1 core)
client: 5 (512 MB, 1 core)
data: 3 (8 GB, 4 cores)
total shards: 82
~160M docs
I want the best maximum throughput for all queries.
As for VMs vs. physical machines: I agree that physical machines beat VMs, but
our strategy is to move our platform to the cloud, so VMs are the only choice.
On Wednesday, April 29, 2015 at 4:30:19 PM UTC+8, Jörg Prante wrote:
Can you specify what kind of performance you mean?
- minimal response time for a single query
- maximum throughput for all queries
For maximum performance, all kinds of virtual machines are a bad choice in
comparison to physical machines in your own data center.
Jörg
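The two kinds of performance named above can be measured separately. The following is only a minimal sketch: `run_query` is a hypothetical placeholder (it just sleeps) that you would replace with a real search request against your own cluster, and the query count and concurrency are arbitrary assumptions.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_query():
    """Hypothetical stand-in for one search request; replace with a real client call."""
    time.sleep(0.01)


def timed_call(fn):
    """Return the wall-clock time in seconds that a single call to fn takes."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


def measure(fn, total_queries, concurrency):
    """Run fn total_queries times on `concurrency` threads and summarize the timings."""
    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(lambda _: timed_call(fn), range(total_queries)))
    wall = time.perf_counter() - wall_start
    return {
        "min_latency_s": min(latencies),                    # best single-query response time
        "mean_latency_s": sum(latencies) / len(latencies),  # average response time
        "throughput_qps": total_queries / wall,             # queries completed per second
    }


if __name__ == "__main__":
    print(measure(run_query, total_queries=200, concurrency=8))
```

Running the same measurement at several concurrency levels shows where throughput stops growing, which is the figure that matters when the goal is maximum throughput rather than minimal single-query latency.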
On Wednesday, April 29, 2015 at 9:57:12 PM UTC+8, Jörg Prante wrote:
First you need to find out whether your workload is CPU-bound or
network-bound.
If it is CPU-bound, go for the virtual machine with the best CPU resources.
If it is network-bound, go for the virtual machine that offers the best
network connectivity.
It is very hard to get precise numbers for performance metrics on public
virtual machines because many others are using the same resources at
the same time in an unpredictable way.
Jörg
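One rough way to apply this advice is to sample CPU and network utilization while a representative load test runs and see which resource saturates first. The sketch below assumes you already have utilization samples as fractions between 0 and 1 (for example from your cloud provider's monitoring); the function name and the 0.85 saturation threshold are assumptions for illustration, not an established rule.

```python
def classify_bottleneck(cpu_util, net_util, threshold=0.85):
    """Guess which resource limits the workload from utilization samples.

    cpu_util and net_util are lists of fractions (0.0 to 1.0) sampled during
    a load test. The threshold is an arbitrary assumption.
    """
    if not cpu_util or not net_util:
        raise ValueError("need at least one sample for each resource")
    cpu_avg = sum(cpu_util) / len(cpu_util)
    net_avg = sum(net_util) / len(net_util)
    # Report the resource that is saturated; if both are, pick the busier one.
    if cpu_avg >= threshold and cpu_avg >= net_avg:
        return "cpu-bound"
    if net_avg >= threshold:
        return "network-bound"
    return "inconclusive"


if __name__ == "__main__":
    # Made-up samples, only to show the call.
    print(classify_bottleneck([0.95, 0.92, 0.97], [0.30, 0.35, 0.28]))
```

Because noisy neighbours on public VMs distort individual samples, it helps to repeat the run at different times of day before trusting the result.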
Thanks!
So, in your experience, are Elasticsearch queries more CPU-bound or IO-bound?
Anyway, I will do more performance testing with real data on different VMs to
find out the best combination of CPU and memory for my case.
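Once each VM tier has been benchmarked with real data, picking a tier under a budget is a simple filter-and-max step. The sketch below assumes you record a monthly cost and a measured throughput per tier; the class name, field names, and all numbers in the example are hypothetical placeholders, not real prices or results.

```python
from dataclasses import dataclass


@dataclass
class VmResult:
    name: str
    cores: int
    memory_gb: int
    monthly_cost: float    # taken from your provider's price list
    throughput_qps: float  # measured by your own benchmark


def best_within_budget(results, budget):
    """Return the affordable VM with the highest measured throughput, or None."""
    affordable = [r for r in results if r.monthly_cost <= budget]
    if not affordable:
        return None
    return max(affordable, key=lambda r: r.throughput_qps)


if __name__ == "__main__":
    # Placeholder numbers only; substitute your own measurements and prices.
    results = [
        VmResult("4-core-28g", cores=4, memory_gb=28, monthly_cost=300.0, throughput_qps=110.0),
        VmResult("8-core-14g", cores=8, memory_gb=14, monthly_cost=320.0, throughput_qps=150.0),
    ]
    print(best_within_budget(results, budget=310.0))
```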
On Thursday, April 30, 2015 at 3:13:36 PM UTC+8, Jörg Prante wrote:
As said, it depends.
When bulk-indexing documents, for example, my multi-threaded workload is
network-bound. It can easily be made CPU-bound by pre-processing documents
in single-threaded mode. Some queries are CPU-bound, others are not. If I
retrieve millions of documents in a row, decompression outweighs query
execution and result transport time. There are many knobs to turn, for
example caching or scan/scroll. Because of this, there is no fixed rule
for all situations.
Jörg