Running Elasticsearch on a VMware ESX VM


#1

What's the general opinion about running Elasticsearch on VMware ESX VMs?
Are many people running Elasticsearch reliably in this kind of environment?
Assuming the VM is set up with dedicated/guaranteed memory and CPU, are there any
things to be concerned about?

Disk I/O performance? SAN vs. local HDD (or local SSD).
Network I/O performance? Driver issues...

Many thanks

-N

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/36e4b0e1-ad6b-4514-b035-2bb0b200c2fa%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Tony Su) #2

Although I'm running on a different VMware product, I've run things other
than ES on VMware before.
IMO it's a good platform, with no unusual issues.

Just the usual things...
If the VM was created on a different machine, then you may need to verify
that there are no partition alignment issues.
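
One way to check alignment from inside a Linux guest is to read each partition's start sector from sysfs and test it against an alignment boundary. What follows is only a minimal sketch, assuming a Linux guest and a 1 MiB boundary (the right boundary depends on your storage, so treat that value as an assumption). The function names are illustrative and do not come from any existing tool.

```python
import glob
import os

SYSFS_SECTOR_BYTES = 512  # sysfs reports a partition's "start" in 512-byte units


def is_aligned(start_sector, boundary_bytes=1024 * 1024):
    """Return True if the partition's byte offset is a multiple of the boundary."""
    return (start_sector * SYSFS_SECTOR_BYTES) % boundary_bytes == 0


def misaligned_partitions(sysfs_root="/sys/block", boundary_bytes=1024 * 1024):
    """List (partition, start_sector) pairs whose start is not on the boundary."""
    bad = []
    # Partitions appear as /sys/block/<disk>/<partition>/start
    for start_file in sorted(glob.glob(os.path.join(sysfs_root, "*", "*", "start"))):
        with open(start_file) as f:
            start = int(f.read().strip())
        if not is_aligned(start, boundary_bytes):
            bad.append((os.path.basename(os.path.dirname(start_file)), start))
    return bad


if __name__ == "__main__":
    for name, start in misaligned_partitions():
        print(f"{name}: starts at sector {start}, not on a 1 MiB boundary")
```

A partition starting at sector 2048 passes the 1 MiB check, while the old default of sector 63 does not.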

In my current ES testing, my main bottleneck is disk I/O.
Curiously, Marvel isn't reporting disk usage in the red, but disk is obviously
the most heavily loaded of the resources in use (so maybe it's not a
real bottleneck?). My current storage is shared SATA/SAS.

As always, if you're dealing with SSDs, read up on write amplification
and how to deal with it for whichever Guest OS you use.

Tony


#3

Thanks Tony.

I have had a few issues before with Linux VM environments (ESX):

  • Single points of failure: the SAN goes down and you lose all your servers.
  • The SAN controller gets saturated and IOWait on all servers goes
    through the roof, so everything runs very slowly.
  • I have heard other anecdotes about networking, with the ESX drivers
    not coping under heavy load. We certainly had loads of networking-related
    problems that, many times, we weren't able to get to the bottom of, and the
    environment support personnel lacked the talent/resources to figure them out
    either.
  • Many, many problems due to oversubscription (but that's just down to the
    environment being run by muppets).
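
For the IOWait symptom in the list above, a rough way to put a number on it from inside a Linux guest is to sample the aggregate `cpu` line of `/proc/stat` twice and compare the iowait ticks with the total. This is only a sketch under that assumption; the function names are made up for illustration, and tools such as `iostat` or `top` report the same figure.

```python
import time

IOWAIT_INDEX = 4  # cpu line fields: user nice system idle iowait irq softirq steal ...


def read_cpu_times(stat_text):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of tick counts."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(x) for x in fields[1:]]
    raise ValueError("no aggregate cpu line found")


def iowait_percent(before, after):
    """Share of CPU time spent in iowait between two samples, in percent."""
    # Use only the first eight fields: guest time is already counted in user/nice.
    deltas = [b - a for a, b in zip(before[:8], after[:8])]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[IOWAIT_INDEX] / total


def sample_iowait(interval_seconds=1.0, stat_path="/proc/stat"):
    """Take two samples of /proc/stat and return the iowait percentage between them."""
    with open(stat_path) as f:
        before = read_cpu_times(f.read())
    time.sleep(interval_seconds)
    with open(stat_path) as f:
        after = read_cpu_times(f.read())
    return iowait_percent(before, after)


if __name__ == "__main__":
    print(f"iowait: {sample_iowait():.1f}%")
```

If every guest on the same SAN shows a high value at the same time, that points at shared storage rather than at any one VM.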

-N


(Tony Su) #4

IMO

I would generally design the Guest images to run on local storage, not on a
SAN, to decrease load on the network links. The Guest image shouldn't be
very large anyway.
Data should be stored on the SAN for shared access.

Aside from that, any networking issues depend, as usual, on the hardware and
your configuration. Maybe you need fatter pipes or more pipes to your SAN,
but if you're currently running multiple Guest images from the SAN, that's
likely a major load by itself.

Another thing you can do is use a distro/OS with a recent architecture. A
couple of years ago several leading Linux distros started moving many of
their mount points into tmpfs, which means they're running in RAM instead of
hitting your hard drive. In fact, for those OSes, if you don't store data on
local storage, the disks aren't touched much after the initial boot. Of course,
this generally won't include the "Long Term" distro versions, but IMO Java
apps like ES aren't likely to be affected by changes in the OS outside the JVM.
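
To see which mount points a given guest actually keeps in tmpfs, you can parse `/proc/mounts`. A minimal sketch, assuming a Linux guest; the function name is illustrative only.

```python
def tmpfs_mount_points(mounts_text):
    """Return mount points whose filesystem type is tmpfs, given /proc/mounts text."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts columns: device, mount point, fs type, options, dump, pass
        if len(fields) >= 3 and fields[2] == "tmpfs":
            points.append(fields[1])
    return points


if __name__ == "__main__":
    with open("/proc/mounts") as f:
        for point in tmpfs_mount_points(f.read()):
            print(point)
```

Which paths show up (for example `/run` or `/tmp`) varies by distro and release, so check the output on your own image.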

HTH,
Tony


(Mark Walkom) #5

You probably don't want to run ES with a large number of nodes off a SAN,
due to the I/O requirements.
We run ES on a XenServer cluster, but everything is on local disk; we
push redundancy up the stack and let ES handle it with replicas.
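
As an illustration of letting ES handle redundancy with replicas, the replica count is an index setting that can be changed through the index settings endpoint. The sketch below only builds the request rather than sending it; the index name `logs` and the helper name are hypothetical, and how many replicas you need depends on how many node failures you want to survive.

```python
import json


def replica_settings_request(index_name, replicas):
    """Build the method, path, and JSON body for an index settings update."""
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    body = {"index": {"number_of_replicas": replicas}}
    return "PUT", f"/{index_name}/_settings", json.dumps(body)


if __name__ == "__main__":
    method, path, body = replica_settings_request("logs", 1)
    # Send this with any HTTP client to a node in the cluster.
    print(method, path, body)
```

With one replica per shard and nodes on separate hosts, losing a single host's local disk still leaves a full copy of the data in the cluster.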

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com


(Tony Su) #6

Hi Mark,
I'm sure the type of query makes a big difference, which limits how well
your scenario compares with others.

Still, if you observed the I/O bottleneck in practice (my guess is that it's
likely network related), I'm curious whether you could describe a little of the
scenario, e.g.:

  • queries per second
  • type and size of network connection between the Host(s) and SAN
  • number of data and query nodes

Thanks, if that's available,
Tony

