Best practices for ops folks

I've really been looking for one place that offers good configuration suggestions for us ops folks: how to set up an Elasticsearch cluster, CPU-to-shard ratios, how many replicas to have, IOPS-to-disk-size ratio, that sort of interesting stuff.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

I don't know if this helps you with all of these concerns, but I found it
tremendously useful:

On Thursday, 16 May 2013 15:30:49 UTC-4, Wojons Tech wrote:


I have done a little experimenting in that area, but the material is quite large and my schedule too tight to write it up as a proper blog post or something, I know it's awful.

Here is the essence of my thoughts to your questions.

CPU-to-shard ratio: a shard is a Lucene index, so the answer depends on the size of the shards you create and how powerful the CPU is for Lucene queries. I recommend shard sizes around 1 GB so that replication and recovery can handle them in a few seconds; the network bandwidth must be there for that. Once the cluster is up, check your query load distribution. Then my rule of thumb is "one shard per CPU core". The basic idea is that you can put load on the CPU and it can execute requests on every shard on the node without delay. In reality, Lucene is multithreaded and not every shard is involved in every query, so you can put more shards on a node. A node should be able to serve around 100-1000 qps with current CPU cores and network interfaces (total turnaround time with all latencies included; I measured 250-500 qps for average random term queries on a six-core AMD Opteron 4170 - YMMV with large docs and mappings). An advantage of ES is that you can take bread-and-butter CPUs and spread the load over many machines very easily by adding nodes to the cluster, without compromising response time. A performance factor might be differing shard sizes on a node depending on index sizes, but there is not much you can do about that, unless you can ensure all indexes are equally sized, which is far from reality.
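Those rules of thumb can be turned into back-of-the-envelope arithmetic. A minimal Python sketch, assuming the ~1 GB target shard size and the one-shard-per-core heuristic above (the numbers are illustrative, not ES defaults):

```python
import math

def shards_for_index(index_size_gb, target_shard_gb=1.0):
    """Primary shards needed to keep each shard near the target size."""
    return max(1, math.ceil(index_size_gb / target_shard_gb))

def nodes_for_shards(total_shards, cores_per_node):
    """Nodes needed under the "one shard per CPU core" rule of thumb."""
    return math.ceil(total_shards / cores_per_node)

# A 60 GB index at ~1 GB per shard needs 60 primaries; on six-core
# machines the rule of thumb suggests about 10 nodes.
```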

IOPS-to-disk-size ratio: with disks as they are today, you won't have to care much about disk size if you have a suitable SAS2/SATA3 RAID controller (6 Gbit/s) and enough PCIe lanes for disk transfer. That is, use disk sizes that fit your HBA (currently available sizes are 500 GB and up). Depending on the layout of your ES cluster for massive indexing, you may have to cope with write I/O challenges, which often originate in a badly balanced CPU-to-disk performance ratio in the node hardware. Read I/O challenges are rare because you can add vast amounts of RAM and mmap the whole ES process. Plus, Java NIO is well managed by the OS filesystem (here, JRE 7 NIO.2 provides more interesting features than JRE 6 NIO, but that is another topic, mainly for Netty). Adding fast disks to your system helps a lot. IOPS does not matter much anymore with SSDs; the numbers go through the roof. Run your ES indexes on SSD and you will never want to go back to spinning disks. Small SSDs can be configured as RAID 0 for best write performance; I measured 1.5 GByte/s reads and 800 MByte/s writes with four 128 GB Plextor M5S SSDs in a RAID 0 on an LSI 2008 HBA. There is no need to set up RAID 1 or higher because of the built-in redundancy of the ES replica setting: if a disk subsystem fails, let the whole node happily fail - you still have the cluster up and running.
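To tie this back to the replica question: shard and replica counts are set per index at creation time, and one replica already gives you the node-failure redundancy that makes RAID 1 unnecessary. A hedged example against a local node (the index name `logs` and the counts are illustrative):

```sh
curl -XPUT 'http://localhost:9200/logs' -d '{
  "settings": {
    "number_of_shards": 6,
    "number_of_replicas": 1
  }
}'
```

With one replica, every shard exists on two different nodes, so losing a node (or its disk subsystem) leaves a full copy of the data in the cluster.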

If you have more questions, just ask

Jörg

On 16.05.13 21:30, Wojons Tech wrote:


Thank you so much for the help. Normally I would point out that friends don't let friends RAID 0, but I can see your point here. If you're speaking about up to 1k qps per core at 1 GB of disk space, that seems a little low; what happens when you start having indexes in the terabyte range?

On Fri, May 17, 2013 at 2:37 PM, Jörg Prante joergprante@gmail.com wrote:


--
Enjoy,
Alexis Okuwa
WojonsTech
424.835.1223


Hello,

As ES can handle a bunch of disks, there is no need to aggregate them as RAID 0; just give it a list of mount points (one for each disk), comma separated.
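For reference, the setting Damien means is `path.data` in `elasticsearch.yml`, which accepts a comma-separated list of directories (the mount points below are illustrative):

```yaml
# elasticsearch.yml: one entry per physical disk
path.data: /mnt/disk1,/mnt/disk2,/mnt/disk3
```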

Best regards,

--
Damien
On 18 May 2013 02:27, "wojonstech" wojonstech@gmail.com wrote:


I don't understand the "1 gig of disk space", but there is no problem with TB sizes in ES. Just use LVM and set up striped LVs over your PVs (RAID 0 style) http://www.redhat.com/magazine/009jul05/features/lvm2/ or use ZFS.
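A sketch of that LVM striping, under the assumption of three dedicated data disks (the device names and mount point are illustrative, and `pvcreate`/`mkfs` destroy whatever is on those disks):

```sh
pvcreate /dev/sdb /dev/sdc /dev/sdd              # register disks as physical volumes
vgcreate vg_es /dev/sdb /dev/sdc /dev/sdd        # pool them into one volume group
lvcreate -i 3 -I 256 -l 100%FREE -n lv_es vg_es  # stripe the LV across all 3 PVs (RAID 0 style)
mkfs.ext4 /dev/vg_es/lv_es
mount /dev/vg_es/lv_es /var/lib/elasticsearch
```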

Because my servers are always fully equipped with drives, I don't think about adding drives to LVM on a node. Adding nodes to ES is much easier.

Jörg

On 18.05.13 02:27, wojonstech wrote:

If you're speaking about up to 1k qps per core at 1 GB of disk space, that seems a little low; what happens when you start having indexes in the terabyte range?


Just a note: you can do that too, but that's for cases where you have disks of different sizes or free space. The data files are not distributed equally over the mount points (and they are not guaranteed to grow equally distributed).

Jörg

On 18.05.13 08:58, Damien Hardy wrote:

As ES can handle a bunch of disks, there is no need to aggregate them as RAID 0; just give it a list of mount points (one for each disk), comma separated.


@damien,
Which option in the config file lets me list a set of mount points? All the drives are the same size. Assuming my index fits in memory, is there really any reason to use SSD drives?

@jorg,
I was referring to when you said "A node should be able to serve around 100-1000 qps". That seems pretty small to me; what are the factors that determine something like that?

On Sat, May 18, 2013 at 12:45 AM, Jörg Prante joergprante@gmail.com wrote:


1000 qps corresponds to 1 ms latency over the network, which is quite usual for the TCP stack and the JVM. Another factor is query complexity: boolean operations, many terms, or queries with facets, for example. Note that you can add nodes to scale the qps rate.
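The arithmetic behind that number, as a sketch (pure back-of-the-envelope; it ignores server-side saturation):

```python
def qps_single_client(round_trip_ms):
    """A single synchronous client is capped at 1 / round-trip time."""
    return 1000.0 / round_trip_ms

def qps_concurrent(round_trip_ms, clients):
    """Concurrent clients raise throughput until CPU, disk, or NIC saturates."""
    return clients * 1000.0 / round_trip_ms

# A 1 ms round trip caps one synchronous client at 1000 qps,
# which is where the "around 100-1000 qps" per-node figure comes from.
```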

Jörg

On 18.05.13 16:24, wojonstech wrote:

I was referring to when you said "A node should be able to serve around 100-1000 qps". That seems pretty small to me; what are the factors that determine something like that?


Okay, I guess that makes sense. I know other databases can do a lot of queries per second, but I guess you're speaking about 1000 qps per core.
On May 18, 2013 10:47 AM, "Jörg Prante" joergprante@gmail.com wrote:


Jörg Prante wrote:

Just a note: you can do that too, but that's for cases where you have disks of different sizes or free space. The data files are not distributed equally over the mount points (and they are not guaranteed to grow equally distributed).

The distribution is not guaranteed to be equal, but the ones with
more free space are weighted more heavily.

-Drew
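Drew's weighting can be pictured with a toy sketch (illustrative only, not Elasticsearch's actual allocation code):

```python
import random

def pick_data_path(free_bytes_by_path):
    """Choose a data path with probability proportional to its free
    space, so emptier mount points tend to receive new shard files."""
    paths = list(free_bytes_by_path)
    weights = [free_bytes_by_path[p] for p in paths]
    return random.choices(paths, weights=weights, k=1)[0]
```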
