Elasticsearch hardware planning

I am preparing proposals on hardware for our Elasticsearch log storage.

What I would love to have are SSD's for most recent logs or SSD's for hot
data. For that I have come down to two solutions with 3x physical servers.

  1. Use Windows 2012 R2 as the OS, use Storage Spaces to provide tiered
    storage for SSD's and HDD's to Elasticsearch.

  2. Install ESXi. Create two VM's on each server, one that has access to SSD
    disks and one that has access to HDD's. That will give me a 6 node
    Elasticsearch cluster. Use tags to keep the latest indices on the SSD
    nodes; when they are x days old a script or curator will remove the SSD
    tag from the index and add a HDD tag, hopefully resulting in a migration
    to the HDD nodes. Either Windows 2012 R2 or Ubuntu will be used here.
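
The tag swap in option 2 could be sketched roughly as below. This is only a
sketch under assumptions: the node attribute name "disk_type" and the
"logstash-YYYY.MM.DD" index naming are hypothetical, and the script only
prints the settings update it would send instead of calling the cluster.
It relies on the index-level setting index.routing.allocation.require.<attribute>;
overwriting its value with "hdd" replaces the earlier "ssd" requirement.

```python
import json
from datetime import date, datetime, timedelta

# Hypothetical names: the node attribute "disk_type" and the
# "logstash-YYYY.MM.DD" index naming are assumptions, not from the post.
ATTRIBUTE = "disk_type"
INDEX_PREFIX = "logstash-"


def indices_older_than(index_names, days, today):
    """Return the daily indices whose date suffix is more than `days` old."""
    cutoff = today - timedelta(days=days)
    old = []
    for name in index_names:
        if not name.startswith(INDEX_PREFIX):
            continue
        try:
            day = datetime.strptime(name[len(INDEX_PREFIX):], "%Y.%m.%d").date()
        except ValueError:
            continue  # not a daily index, leave it alone
        if day < cutoff:
            old.append(name)
    return sorted(old)


def allocation_settings(tier):
    """Body for PUT /<index>/_settings that requires nodes of one tier."""
    return {"index.routing.allocation.require." + ATTRIBUTE: tier}


if __name__ == "__main__":
    names = ["logstash-2014.11.01", "logstash-2014.12.03", "kibana-int"]
    for index in indices_older_than(names, days=7, today=date(2014, 12, 4)):
        body = json.dumps(allocation_settings("hdd"))
        print("PUT /%s/_settings %s" % (index, body))
```

Run from cron once a day, something like this would leave new indices on the
SSD nodes and hand older ones to the HDD nodes; curator can do the same job.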

Each Elasticsearch node will get either 32gig memory or 64gig memory.
Undecided on that at the moment, might go with the lower amount to have the
option of expanding it if there is need. With 192gigs in the cluster there
might not even be a need for SSD's.

Not sure about the number of cores.

Plan to skip raid on everything except the OS disks and use Elasticsearch
striping for the data. Log volume will be about 20gigs a day without
replication, and total storage will be about 15tb.
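
As a rough check on those numbers, the retention they imply can be worked out
as below. This is a sketch that assumes the 15tb is raw capacity in decimal
units and that every replica stores a full extra copy of each day's logs;
index overhead and free-space headroom are ignored.

```python
def retention_days(total_gb, daily_gb, replicas=0):
    """Days of logs that fit when every day is stored 1 + replicas times."""
    return total_gb / (daily_gb * (1 + replicas))


TOTAL_GB = 15 * 1000  # 15tb, assuming decimal units
DAILY_GB = 20         # daily log volume before replication

for replicas in (0, 1, 2):
    days = retention_days(TOTAL_GB, DAILY_GB, replicas)
    print("%d replica(s): about %d days" % (replicas, days))
```

With no replicas that is about 750 days, and each added replica divides the
figure accordingly.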

Any angles I'm missing?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/1e6bbc44-fc3b-4724-b657-63722a951d06%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Be aware that using multiple data locations in ES is akin to RAID0; which
means if you lose a disk then you lose all the data on that node.
Personally, I'd suggest you leverage hardware RAID and let it do what it is
good at, otherwise you just have more management overhead and greater risk
of a hardware failure causing a bigger problem.
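
To put a number on that risk, here is a small sketch. It assumes disks fail
independently and uses a made-up 3% yearly failure rate per disk purely for
illustration; the only thing taken from above is that one failed disk costs
the node all of its data.

```python
def node_loss_probability(disk_failure_rate, disks):
    """Chance that at least one of `disks` independent disks fails."""
    return 1 - (1 - disk_failure_rate) ** disks


if __name__ == "__main__":
    RATE = 0.03  # hypothetical yearly failure rate of a single disk
    for disks in (1, 2, 4, 8):
        chance = node_loss_probability(RATE, disks)
        print("%d data disk(s): %.1f%% chance of losing the node's data" % (disks, chance * 100))
```

The more data paths a node stripes over, the more likely it is that some
disk failure takes the whole node's data with it, which is why replicas or
RAID underneath matter here.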

The rest of your setup looks sane. Going with VMs adds a minimal performance
loss but gives you much more flexibility, and I'd start with the lower
amount of RAM as you mentioned.

(In reply to Elvar Böðvarsson, 4 December 2014 at 22:27.)

Saying that RAID is good for anything is a bit of a stretch :P

I'm not sure how good ES is at splitting the index across volumes, but the
database has a lot more options here for load distribution. RAID is naive
by design and the optimizations a RAID controller/implementation can make
are limited.

If ES can outperform RAID0 then that might be the better route ...
just not sure how good ES is at it though.

(In reply to Mark Walkom, Thursday, December 4, 2014 2:42:25 PM UTC-8.)

RAID is useful, you just need to understand the limits. And the potential
for data loss with multiple ES nodes writing to multiple data directories
is not inconsequential if it's an important system with business
requirements.
To reiterate, because it's really important this is known: if you lose one
of the path.data locations on a node you lose all data on the node. The ES
dev team has had talks about improving this so things are written on a
segment level rather than a direct stripe, but there's no ETA on that as
far as I am aware.

ES will obviously be limited by the OS/FS/hardware/etc throughput of any
given channel. I haven't seen anyone do testing of RAID0 versus ES striping
though, so it's an interesting question.

(In reply to Kevin Burton, 5 December 2014 at 12:11.)

The good thing if you use a raid controller for RAID0 is that you get all
the queuing and buffering features. What I will end up using in this case
will most likely come down to cost and recovery options in case of a
failure. Might even go with other RAID levels: 1, 10 or 6.

I read this article,
https://codeascraft.com/2014/12/04/juggling-multiple-elasticsearch-instances-on-a-single-host/
which brings up a very interesting point: splitting Elasticsearch processes
into 8gb chunks.

Regarding the SSD options, what would be the best way to go about doing
that?

Also, has anyone done a performance comparison between Windows and Linux
for Elasticsearch?

  • I have read articles, old ones, that state that JVM performance is better
    on Windows.
  • Went to a high performance session at VMworld regarding JVM performance;
    the speaker did not want to go into the pit of discussing which was
    better. He did though point out that the JVM on Windows works in 1mb
    memory chunks while the Linux JVM uses 256k chunks.

(In reply to Mark Walkom, Friday, December 5, 2014 1:57:27 AM UTC.)