I am preparing proposals on hardware for our Elasticsearch log storage.
What I would love to have is SSDs for the most recent logs, that is, SSDs for
hot data. For that I have come down to two solutions, both with 3x physical
servers.
Solution one: use Windows 2012 R2 as the OS and use Storage Spaces to provide
tiered storage across SSDs and HDDs to Elasticsearch.
Solution two: install ESXi and create two VMs on each server, one with access
to the SSD disks and one with access to the HDDs. That will give me a 6-node
Elasticsearch cluster. Use tags to keep the latest indices on the SSD nodes;
when they are x days old a script or Curator will remove the SSD tag from the
index and add an HDD tag, hopefully resulting in a migration to the HDD
nodes. Either Windows 2012 R2 or Ubuntu will be used here.
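As a rough illustration, here is a minimal sketch of that tag flip using the
Python elasticsearch client (the attribute name "tag", the logstash-* index
pattern and the 7-day cutoff are placeholders, and each node would carry
"node.tag: ssd" or "node.tag: hdd" in its elasticsearch.yml):

    # Sketch only: move indices older than N days from SSD-tagged to HDD-tagged nodes.
    from datetime import datetime, timedelta

    from elasticsearch import Elasticsearch

    es = Elasticsearch(["localhost:9200"])
    cutoff = datetime.utcnow() - timedelta(days=7)

    for index in es.indices.get_settings(index="logstash-*"):
        # Daily Logstash-style names end in the date, e.g. logstash-2014.12.05.
        index_day = datetime.strptime(index[-10:], "%Y.%m.%d")
        if index_day < cutoff:
            # Requiring the hdd tag makes the cluster relocate the shards
            # off the SSD nodes on its own.
            es.indices.put_settings(
                index=index,
                body={"index.routing.allocation.require.tag": "hdd"},
            )

As far as I know, Curator's allocation feature can do essentially the same
thing without a custom script.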
Each Elasticsearch node will get either 32 GB or 64 GB of memory. I am
undecided on that at the moment; I might go with the lower amount to keep the
option of expanding it if the need arises. With 192 GB in the cluster there
might not even be a need for SSDs.
Not sure about the number of cores.
The plan is to skip RAID on everything except the OS disks and use
Elasticsearch striping (multiple data paths) for the data. Ingest will be
about 20 GB a day without replication, and total storage will be about 15 TB.
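For scale, a back-of-the-envelope retention calculation from those numbers
(the replica count is only an assumption):

    # Rough sizing sketch based on the figures above.
    daily_ingest_gb = 20      # primary data per day, before replication
    total_capacity_tb = 15    # usable capacity across the cluster
    replicas = 1              # one replica doubles the daily footprint

    footprint_per_day_gb = daily_ingest_gb * (1 + replicas)
    retention_days = total_capacity_tb * 1024 / footprint_per_day_gb
    print("roughly %d days of retention" % retention_days)  # ~384 days with one replica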
Be aware that using multiple data locations in ES is akin to RAID0, which
means that if you lose a disk you lose all the data on that node.
Personally, I'd suggest you leverage hardware RAID and let it do what it is
good at, otherwise you just have more management overhead and greater risk
of a hardware failure causing a bigger problem.
The rest of your setup looks sane; going with VMs adds a minimal performance
loss but gives you much more flexibility, and I'd start with the lower amount
of RAM as you mentioned.
Saying that RAID is good for anything is a bit of a stretch.
I'm not sure how good ES is at splitting an index across volumes, but the
database has a lot more options here for load distribution. RAID is naive by
design, and the optimizations a RAID controller/implementation can make are
limited.
If ES can outperform RAID0 then that might be the better route ... I'm just
not sure how good ES is at it.
RAID is useful; you just need to understand the limits. And the potential
for data loss with multiple ES nodes writing to multiple data directories
is not inconsequential if it's an important system with business
requirements.
To reiterate, because it's really important that this is known: if you lose
one of the data.dir mount points on a node, you lose all data on that node.
The ES dev team has had talks about improving this so that things are written
at the segment level rather than as a direct stripe, but there's no ETA on
that that I am aware of.
ES will obviously be limited by the OS/FS/hardware/etc. throughput of any
given channel; I haven't seen anyone test RAID0 versus ES striping though, so
it's an interesting question.
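To put the multiple-data-locations point in concrete terms, here is a small
sketch with the Python elasticsearch client (the host is a placeholder) that
lists every data path each node is writing to; with several directories under
path.data, shards end up spread across all of them, so any single failed path
takes that node's data with it:

    # Sketch only: list each node's data paths and free space via the nodes stats API.
    from elasticsearch import Elasticsearch

    es = Elasticsearch(["localhost:9200"])
    stats = es.nodes.stats(metric="fs")

    for node in stats["nodes"].values():
        for disk in node["fs"]["data"]:
            free_gb = disk["free_in_bytes"] / 1024.0 ** 3
            print("%-20s %-35s %.1f GB free" % (node["name"], disk["path"], free_gb))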
The good thing about using a RAID controller for RAID0 is that you get all
the queuing and buffering features. What I end up using in this case will
most likely come down to cost and to recovery options in case of a failure.
I might even go with other RAID levels: 1, 10 or 6.
Regarding the SSD options, what would be the best way to go about doing
that?
Also, has anyone done a performance comparison between Windows and Linux
for Elasticsearch? I have read articles, old ones, that state that JVM
performance is better on Windows.
I went to a high-performance session at VMworld regarding JVM performance;
the speaker did not want to go into the pit of discussing which was better.
He did, though, point out that the JVM on Windows works in 1 MB memory chunks
while the JVM on Linux uses 256 KB chunks.