What types of SSDs?

Using ES 1.2.1

I know each system and its functionality is different, but I'm just curious:
when people say buy SSDs for ES, what types of SSDs are they buying?

Fortunately for me I had some Fusion IO cards to test with, but I'm just
wondering if they're worth the price, or if I should look into off-the-shelf
SSDs like Samsung EVOs using SAS instead of pure SATA.

So far from my testing, all search operations seem to return in the same
amount of time regardless of the drive type. So I suppose caching is playing
a huge part here.

Though when looking at the HQ indexing stats like query time, fetch time,
refresh time, etc., the Fusion IO fares a bit better than regular SSDs
using SATA.

For instance, refresh time for Fusion IO is 250ms, while for regular SSDs
(SATA, NOT SAS; will test SAS when I get a chance) it's just above 1 second.
Even with Fusion IO I do see some warnings on the index stats, but it's
slightly better than regular SSDs.

Some strategies I picked for my indexes...

  • New index per day, plus routing by "user"
  • New index per day for monster users.
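
A minimal sketch of how the two strategies above might map onto index names
and routing values. This is one reading of the list, not the poster's actual
code: the `events` prefix, the `MONSTER_USERS` set and the helper names are
hypothetical. Only the idea of a daily index with a per-user `routing`
parameter comes from the post.

```python
from datetime import date
from urllib.parse import quote, urlencode

# Hypothetical: users large enough to get their own daily index.
MONSTER_USERS = {"bigcorp"}


def daily_index(user: str, day: date, prefix: str = "events") -> str:
    """Return the index name for a user's documents on a given day."""
    stamp = day.strftime("%Y.%m.%d")
    if user in MONSTER_USERS:
        # Monster users get a dedicated index per day.
        return f"{prefix}-{user}-{stamp}"
    # Everyone else shares one index per day.
    return f"{prefix}-{stamp}"


def index_path(user: str, day: date, doc_type: str, doc_id: str) -> str:
    """Build the request path for indexing one document."""
    path = f"/{daily_index(user, day)}/{quote(doc_type)}/{quote(doc_id)}"
    if user in MONSTER_USERS:
        return path
    # Shared index: route by user so one user's documents share a shard.
    return f"{path}?{urlencode({'routing': user})}"


print(index_path("alice", date(2014, 7, 10), "event", "1"))
print(index_path("bigcorp", date(2014, 7, 10), "event", "1"))
```

Searches for a routed user would pass the same `routing` value so that only
the shard holding that user's documents is queried.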

Using JMeter to test...

  • Achieved 3,500 index operations per second (not bulk), avg document size
    2,500 bytes (Fusion IO seemed to perform a bit better)
  • Created a total of 25 indexes totaling over 100,000,000 documents,
    anywhere between 3,000,000 and 5,000,000 documents per index.
  • Scroll query to retrieve 15,000,000 documents out of the 100,000,000 (all
    indexes) took 25 minutes regardless of drive type.
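
To put the JMeter figures above in perspective, here is a rough
back-of-the-envelope calculation. It only uses the numbers quoted in the
list and looks at raw document payload, so it ignores HTTP, translog, merge
and replica overhead; treat it as a sketch, not a measurement.

```python
# Figures quoted in the post above.
index_ops_per_sec = 3_500
avg_doc_bytes = 2_500
scrolled_docs = 15_000_000
scroll_minutes = 25

# Raw payload rate while indexing (overheads not included).
ingest_mb_per_sec = index_ops_per_sec * avg_doc_bytes / 1_000_000

# Documents returned per second by the scroll.
scroll_docs_per_sec = scrolled_docs / (scroll_minutes * 60)

# The same scroll expressed as raw payload bandwidth.
scroll_mb_per_sec = scroll_docs_per_sec * avg_doc_bytes / 1_000_000

print(f"ingest: {ingest_mb_per_sec:.2f} MB/s")
print(f"scroll: {scroll_docs_per_sec:.0f} docs/s, {scroll_mb_per_sec:.1f} MB/s")
```

Both payload rates are small next to the sequential throughput of a typical
SSD, which would be consistent with drive type making little visible
difference in these tests, although the real I/O is higher than the payload
alone.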

P.S.: I want to index 2,000,000,000 documents per year, so about 4,000,000
per day. So you can see why Fusion IO could be expensive :)
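
A quick sanity check of the yearly target, using only the figures from the
post (2,000,000,000 documents per year, about 4,000,000 per day, 2,500 bytes
average). Note that the two volume figures do not quite agree; the sketch
shows both directions. The storage estimate is raw source only and assumes
nothing about replicas or index overhead.

```python
docs_per_year = 2_000_000_000
docs_per_day_quoted = 4_000_000
avg_doc_bytes = 2_500

# Daily volume implied by the yearly target (365-day year).
per_day_from_year = docs_per_year / 365

# Yearly volume implied by the quoted daily figure.
per_year_from_day = docs_per_day_quoted * 365

# Average sustained rate needed to hit the yearly target.
avg_docs_per_sec = docs_per_year / (365 * 24 * 60 * 60)

# Raw document payload per year, in terabytes (10**12 bytes).
raw_tb_per_year = docs_per_year * avg_doc_bytes / 1e12

print(f"{per_day_from_year:,.0f} docs/day implied by the yearly target")
print(f"{per_year_from_day:,} docs/year implied by the daily figure")
print(f"{avg_docs_per_sec:.1f} docs/s on average")
print(f"{raw_tb_per_year:.1f} TB of raw documents per year")
```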

Thanks

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/24928d08-6354-4661-8164-9ff665709285%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

As you mention, ES is pretty aggressive with its caching, and higher
capacity IO helps with everything else.

Fusion IO is better than regular SSDs and it's pretty awesome if you can
afford it. But when it comes to what type of SSD, it's really a financial
decision around your capacity requirements. We run ES boxes with 24-disk SAS
RAID10 and the IO capabilities are good enough for what we need ES to do, so
we aren't looking to change that for now. But we have the option to upgrade
these to SSDs if the capacity is needed.

If you are at scale and have the processes to handle it, running something
like Quanta gear with commodity SSDs (i.e. not "enterprise" branded ones)
makes a lot of sense. Then it's a matter of treating hardware as a real
throwaway commodity and not worrying if you lose N units, which, combined
with ES's scalability and redundancy, would be very effective.

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com

Did you consider SSDs in RAID0 (Linux, ext4, noatime) with a SAS2 (6 Gb/s)
or SAS3 (12 Gb/s) controller?

For personal use at home I have an LSI SAS 2008 controller with 4x 128 GB
SSDs in RAID0, with sustained 800 MB/s write and 950 MB/s read, on a
commodity dual-socket AMD C32 server mainboard. I do not test with JMeter,
but on this single-node hardware alone I observe 15k bulk index operations
per second, and a scan/scroll over 45 million docs takes less than 70
minutes.
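
For readers comparing the bulk figure here with the non-bulk JMeter test
earlier in the thread, this is a minimal sketch of what a `_bulk` request
body looks like: one action line followed by one source line per document,
newline-delimited. The index name, type name and sample documents are
hypothetical, and the sketch only builds the body; it does not send it.

```python
import json


def bulk_body(index: str, doc_type: str, docs: list) -> str:
    """Serialize documents as a newline-delimited _bulk request body."""
    lines = []
    for doc in docs:
        # Action line: tells Elasticsearch where the next document goes.
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        # Source line: the document itself, kept on a single line.
        lines.append(json.dumps(doc))
    # The bulk format requires a newline after the last line as well.
    return "\n".join(lines) + "\n"


# Hypothetical sample documents.
body = bulk_body("events-2014.07.10", "event", [{"user": "alice"}, {"user": "bob"}])
print(body, end="")
```

The body would be POSTed to the `_bulk` endpoint, so many documents share a
single round trip instead of one request each.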

I'm waiting until SAS3 is affordable for me. For the future I have on my
list: LSI SAS 3008 HBA and SAS3 SSDs. For personal home use, Fusion IO is
too heavy for my wallet. Even for commercial purposes I do not consider it
a cost-effective solution.

Just a note: if you want to spend your money to accelerate ES, buy RAM. You
will get more performance than from drives. The reason is the lower latency:
low latency will speed up applications like ES more than the fastest I/O
drive is able to. That reminds me that I've been waiting ages for DDR4
RAM...
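
As a rough illustration of where extra RAM ends up, here is a small sketch
that splits a machine's memory between the JVM heap and the OS filesystem
cache. It assumes the commonly cited guidance of giving at most half the RAM
to the heap and staying below roughly 32 GB; the 31 GB cap, the function
name and the example sizes are assumptions for illustration, not something
stated in this thread.

```python
def split_ram(total_gb: int, heap_cap_gb: int = 31) -> tuple:
    """Return (heap_gb, filesystem_cache_gb) for a node with total_gb RAM."""
    # Assumed rule of thumb: half the RAM for the heap, capped at heap_cap_gb.
    heap = min(total_gb // 2, heap_cap_gb)
    # Whatever is left stays with the OS, mostly as filesystem cache.
    return heap, total_gb - heap


for ram in (16, 72, 96, 200):
    heap, cache = split_ram(ram)
    print(f"{ram} GB RAM -> {heap} GB heap, {cache} GB left for the OS cache")
```

Under these assumptions, RAM beyond about 64 GB goes entirely to the
filesystem cache, which is where frequently read index files are served from
memory instead of the drives.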

Jörg

Right now I have 4 boxes...

2x 32 cores 200GB RAM with RAID10 SATA1 + the Fusion IO

2x 24 cores 96GB RAM with RAID10 SAS but regular mechanical drives.

I only test them as pairs, so it's clusters of 2.

On the surface all searches seem to perform quite close to each other. Only
when looking at the stats in HQ and Marvel is the true story told. For
instance, most warnings with Fusion IO are yellow at worst, while with the
SAS RAID 10 (regular SATA drives) they reach red.

I'm hoping I can get some regular SSDs to put in the SAS boxes and see if
it's better.

You could set up a hot/cold allocation system: put your highly accessed
(hot) indexes on the SSDs and the rest on the spinning disks.
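
A minimal sketch of how such a hot/cold split could be expressed with shard
allocation filtering. It assumes each node is tagged with a custom attribute
in `elasticsearch.yml`; the attribute name `box_type` and the values `hot`
and `cold` are hypothetical labels, and the sketch only builds the
index-settings body rather than calling a cluster.

```python
import json

# Hypothetical attribute name; set per node in elasticsearch.yml, e.g.
#   node.box_type: hot    (SSD nodes)
#   node.box_type: cold   (spinning-disk nodes)
ATTR = "box_type"


def allocation_settings(tier: str) -> str:
    """Return an index-settings body pinning an index's shards to a tier."""
    if tier not in ("hot", "cold"):
        raise ValueError(f"unknown tier: {tier}")
    return json.dumps({f"index.routing.allocation.require.{ATTR}": tier})


# Body for a PUT to /<index>/_settings
print(allocation_settings("hot"))
print(allocation_settings("cold"))
```

With daily indexes, today's index would get the `hot` setting at creation,
and older indexes could be switched to `cold` later, at which point their
shards relocate to the spinning-disk nodes.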

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com

So I got my server with SAS.

It's an HP DL 380P G7
2x 6 cores (hyperthreaded, 24 logical cores), 72GB RAM and 5 Intel 530 SSDs
(RAID 10)

These are the stats while JMeter is pushing 3,500 indexing operations/sec.
Average document size is 2,500 bytes.

  • Indexing - Index: 1.98ms
  • Indexing - Delete: 0ms
  • Search - Query: 9.81ms
  • Search - Fetch: 0.62ms
  • Get - Total: 0ms
  • Get - Exists: 0ms
  • Get - Missing: 0ms
  • Refresh: 215.91ms
  • Flush: 532.62ms
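
For anyone wanting to reproduce numbers like these without HQ, one plausible
derivation is to divide the cumulative time counters from the index stats
API by their operation counts. Whether HQ computes its figures exactly this
way is an assumption, and the sample counters below are made up purely to
show the calculation.

```python
def average_ms(total_time_ms: float, count: int) -> float:
    """Average time per operation, or 0.0 when nothing has run yet."""
    return total_time_ms / count if count else 0.0


def summarize(stats: dict) -> dict:
    """Turn cumulative index stats counters into per-operation averages."""
    indexing, search = stats["indexing"], stats["search"]
    refresh, flush = stats["refresh"], stats["flush"]
    return {
        "index": average_ms(indexing["index_time_in_millis"], indexing["index_total"]),
        "query": average_ms(search["query_time_in_millis"], search["query_total"]),
        "fetch": average_ms(search["fetch_time_in_millis"], search["fetch_total"]),
        "refresh": average_ms(refresh["total_time_in_millis"], refresh["total"]),
        "flush": average_ms(flush["total_time_in_millis"], flush["total"]),
    }


# Made-up counters in the shape of the "total" section of an index stats response.
sample = {
    "indexing": {"index_total": 2000, "index_time_in_millis": 4000},
    "search": {
        "query_total": 100, "query_time_in_millis": 1000,
        "fetch_total": 100, "fetch_time_in_millis": 50,
    },
    "refresh": {"total": 10, "total_time_in_millis": 2000},
    "flush": {"total": 2, "total_time_in_millis": 1000},
}

for name, value in summarize(sample).items():
    print(f"{name}: {value:.2f}ms")
```

Because the counters are cumulative since node start, comparing drives
fairly means taking the difference between two snapshots around each test
run.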
