How many shards is to many shards per server on SSD?

burtonator · October 7, 2014, 7:07pm

I'm curious how many shards would be too many shards per server... and
wanted feedback based on community experience.

On HDD you probably want to be a big conservative.... but on SSD I think
you could probably get away with a higher number.

10 seems to be somewhat reasonable I imagine. Not perfect but not bad
either. This way you could grow 10x...

PS. Really a shame that ES can't do shard splitting yet.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/8c22ef66-5794-4c74-9410-5d83d3a1ee5d%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

jprante · October 7, 2014, 7:32pm

The number of shards is in not related to HDD or SSD.

SSD reduces vastly the latency of I/O operations, ES must no longer wait
for I/O seeks, reads, or writes, so the overall throughput is significantly
faster, which is noticeable while indexing bulks of data.

Jörg

On Tue, Oct 7, 2014 at 9:07 PM, Kevin Burton burtonator@gmail.com wrote:

I'm curious how many shards would be too many shards per server... and
wanted feedback based on community experience.

On HDD you probably want to be a big conservative.... but on SSD I think
you could probably get away with a higher number.

10 seems to be somewhat reasonable I imagine. Not perfect but not bad
either. This way you could grow 10x...

PS. Really a shame that ES can't do shard splitting yet.

--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/8c22ef66-5794-4c74-9410-5d83d3a1ee5d%40googlegroups.com
https://groups.google.com/d/msgid/elasticsearch/8c22ef66-5794-4c74-9410-5d83d3a1ee5d%40googlegroups.com?utm_medium=email&utm_source=footer
.
For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAKdsXoGx9PW7yGfC6z2J074rzryVDmtLNgSRYPddJ1dS%2BEMmxg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

nik9000 · October 7, 2014, 7:37pm

We have hundreds of shards on machines with two SSDs each. Some are large
shards (20GB) but most are small (a couple MB). It works fine except for
some trouble how Elasticsearch picks what shards go where (hint: it doesn't
take shard size into account beyond not filling up the disks).

Nik

On Tue, Oct 7, 2014 at 3:32 PM, joergprante@gmail.com <joergprante@gmail.com

wrote:

The number of shards is in not related to HDD or SSD.

SSD reduces vastly the latency of I/O operations, ES must no longer wait
for I/O seeks, reads, or writes, so the overall throughput is significantly
faster, which is noticeable while indexing bulks of data.

Jörg

On Tue, Oct 7, 2014 at 9:07 PM, Kevin Burton burtonator@gmail.com wrote:

I'm curious how many shards would be too many shards per server... and
wanted feedback based on community experience.

On HDD you probably want to be a big conservative.... but on SSD I think
you could probably get away with a higher number.

10 seems to be somewhat reasonable I imagine. Not perfect but not bad
either. This way you could grow 10x...

PS. Really a shame that ES can't do shard splitting yet.

--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/8c22ef66-5794-4c74-9410-5d83d3a1ee5d%40googlegroups.com
https://groups.google.com/d/msgid/elasticsearch/8c22ef66-5794-4c74-9410-5d83d3a1ee5d%40googlegroups.com?utm_medium=email&utm_source=footer
.
For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/CAKdsXoGx9PW7yGfC6z2J074rzryVDmtLNgSRYPddJ1dS%2BEMmxg%40mail.gmail.com
https://groups.google.com/d/msgid/elasticsearch/CAKdsXoGx9PW7yGfC6z2J074rzryVDmtLNgSRYPddJ1dS%2BEMmxg%40mail.gmail.com?utm_medium=email&utm_source=footer
.

For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAPmjWd3AC8cXL3Uzc3AofO3sD%2BmUeyq%3DrTDjv_isNV6_tKwRKA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

burtonator · October 7, 2014, 11:30pm

Perhaps. I just assumed that ES would keep each shard in its own lucene
index on disk. These files aren't free and mean that you have to fsync
them (eventually) and those requires seeks to write the data.

So more shards , should, in theory mean more seeks.

Else you would just create 1B shards.

On Tuesday, October 7, 2014 12:32:16 PM UTC-7, Jörg Prante wrote:

The number of shards is in not related to HDD or SSD.

SSD reduces vastly the latency of I/O operations, ES must no longer wait
for I/O seeks, reads, or writes, so the overall throughput is significantly
faster, which is noticeable while indexing bulks of data.

Jörg

On Tue, Oct 7, 2014 at 9:07 PM, Kevin Burton <burto...@gmail.com
<javascript:>> wrote:

I'm curious how many shards would be too many shards per server... and
wanted feedback based on community experience.

On HDD you probably want to be a big conservative.... but on SSD I think
you could probably get away with a higher number.

10 seems to be somewhat reasonable I imagine. Not perfect but not bad
either. This way you could grow 10x...

PS. Really a shame that ES can't do shard splitting yet.

--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearc...@googlegroups.com <javascript:>.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/8c22ef66-5794-4c74-9410-5d83d3a1ee5d%40googlegroups.com
https://groups.google.com/d/msgid/elasticsearch/8c22ef66-5794-4c74-9410-5d83d3a1ee5d%40googlegroups.com?utm_medium=email&utm_source=footer
.
For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/b7160306-0ce1-4e13-8523-93f6735a1fad%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

burtonator · October 8, 2014, 1:41am

Hm... I suspect that the load isn't equally distributed. Am I right?

I should have mentioned with the same load on the same index with the same
query rate.

And if it's low load you can probably get away with it but once the load
scales it's not going to be super fun.

On Tuesday, October 7, 2014 12:38:11 PM UTC-7, Nikolas Everett wrote:

We have hundreds of shards on machines with two SSDs each. Some are large
shards (20GB) but most are small (a couple MB). It works fine except for
some trouble how Elasticsearch picks what shards go where (hint: it doesn't
take shard size into account beyond not filling up the disks).

Nik

On Tue, Oct 7, 2014 at 3:32 PM, joerg...@gmail.com <javascript:> <
joerg...@gmail.com <javascript:>> wrote:

The number of shards is in not related to HDD or SSD.

SSD reduces vastly the latency of I/O operations, ES must no longer wait
for I/O seeks, reads, or writes, so the overall throughput is significantly
faster, which is noticeable while indexing bulks of data.

Jörg

On Tue, Oct 7, 2014 at 9:07 PM, Kevin Burton <burto...@gmail.com
<javascript:>> wrote:

I'm curious how many shards would be too many shards per server... and
wanted feedback based on community experience.

On HDD you probably want to be a big conservative.... but on SSD I think
you could probably get away with a higher number.

10 seems to be somewhat reasonable I imagine. Not perfect but not bad
either. This way you could grow 10x...

PS. Really a shame that ES can't do shard splitting yet.

--
You received this message because you are subscribed to the Google
Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to elasticsearc...@googlegroups.com <javascript:>.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/8c22ef66-5794-4c74-9410-5d83d3a1ee5d%40googlegroups.com
https://groups.google.com/d/msgid/elasticsearch/8c22ef66-5794-4c74-9410-5d83d3a1ee5d%40googlegroups.com?utm_medium=email&utm_source=footer
.
For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearc...@googlegroups.com <javascript:>.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/CAKdsXoGx9PW7yGfC6z2J074rzryVDmtLNgSRYPddJ1dS%2BEMmxg%40mail.gmail.com
https://groups.google.com/d/msgid/elasticsearch/CAKdsXoGx9PW7yGfC6z2J074rzryVDmtLNgSRYPddJ1dS%2BEMmxg%40mail.gmail.com?utm_medium=email&utm_source=footer
.

For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/408d8091-272b-4bea-8510-b68541cc578f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

jprante · October 8, 2014, 7:07am

With ES, you can go up to the bandwidth limit the OS allows for writing I/O
(if you disable throttling etc.)

This means, if you write to one shard, it can be as fast as writing to
thousands of shards in parallel in summary. There is an OS limit for file
system buffers so the more shards, the more RAM is recommended.

If OS restricts file descriptor limits and you want to write to shards, you
can estimate you need a peak of 100-200 file descriptors for active merges
etc. (this varies from version to version and from setting to setting). ES
does not impose a limit here.

These resource demands and custom configurations are not related to the
choice SSD or HDD.

Jörg

On Wed, Oct 8, 2014 at 1:30 AM, Kevin Burton burtonator@gmail.com wrote:

Perhaps. I just assumed that ES would keep each shard in its own lucene
index on disk. These files aren't free and mean that you have to fsync
them (eventually) and those requires seeks to write the data.

So more shards , should, in theory mean more seeks.

Else you would just create 1B shards.

On Tuesday, October 7, 2014 12:32:16 PM UTC-7, Jörg Prante wrote:

The number of shards is in not related to HDD or SSD.

SSD reduces vastly the latency of I/O operations, ES must no longer wait
for I/O seeks, reads, or writes, so the overall throughput is significantly
faster, which is noticeable while indexing bulks of data.

Jörg

On Tue, Oct 7, 2014 at 9:07 PM, Kevin Burton burto...@gmail.com wrote:

I'm curious how many shards would be too many shards per server... and
wanted feedback based on community experience.

On HDD you probably want to be a big conservative.... but on SSD I think
you could probably get away with a higher number.

10 seems to be somewhat reasonable I imagine. Not perfect but not bad
either. This way you could grow 10x...

PS. Really a shame that ES can't do shard splitting yet.

--
You received this message because you are subscribed to the Google
Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to elasticsearc...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/
msgid/elasticsearch/8c22ef66-5794-4c74-9410-5d83d3a1ee5d%
40googlegroups.com
https://groups.google.com/d/msgid/elasticsearch/8c22ef66-5794-4c74-9410-5d83d3a1ee5d%40googlegroups.com?utm_medium=email&utm_source=footer
.
For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/b7160306-0ce1-4e13-8523-93f6735a1fad%40googlegroups.com
https://groups.google.com/d/msgid/elasticsearch/b7160306-0ce1-4e13-8523-93f6735a1fad%40googlegroups.com?utm_medium=email&utm_source=footer
.

For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAKdsXoFAgKVgd%2BgZsmudCBZGk9nY9g_0XZy%2BnhRL%2BsVzxP_-pg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

burtonator · October 9, 2014, 10:34pm

On Wednesday, October 8, 2014 12:07:30 AM UTC-7, Jörg Prante wrote:

With ES, you can go up to the bandwidth limit the OS allows for writing
I/O (if you disable throttling etc.)

This means, if you write to one shard, it can be as fast as writing to
thousands of shards in parallel in summary. There is an OS limit for file
system buffers so the more shards, the more RAM is recommended.

If OS restricts file descriptor limits and you want to write to shards,
you can estimate you need a peak of 100-200 file descriptors for active
merges etc. (this varies from version to version and from setting to
setting). ES does not impose a limit here.

These resource demands and custom configurations are not related to the
choice SSD or HDD.

There's no way that's true... for example, if you are on HDD, and you're
index is not in memory, AND you're serving queries (which is a realistic
use case) then there is NO way you can write at the full IO of the disk.
It's just physically impossible.

If ES has been able to solve that problem then they could win a nobel prize
:-p

SSD is 2-3 orders of magnitude faster than HDD ... so yes, it is related to
the choice of SSD or HDD.

... it's entirely possible I'm misinterpreting what you're saying though.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/3b40a167-8f82-4f60-9833-8e3bce77370f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

jprante · October 10, 2014, 3:39pm

RAM is 1000x faster than SSD. Elasticsearch loads the index into RAM when
needed, so all searches are after loading index files into memory. Unless
your index does not fit into virtual memory, you can assume that searching
is done on a) RAM b) if not in RAM, you hit the file cache or c) you miss
the file cache and this requires disk seeks by the OS virtual memory
manager. With mlockall you avoid swap, so RAM pages are never dislocated to
a slow disk device and the performance is in a steady state.

I am not sure if that is nobel prize - I rather think this is due to my
hardware and software setup - but in my ES node it is quite usual to see
writes with >200 MB/sec to SSD while bulk indexing. I admit this is not the
value of maximum drive performance but there are other factors which are
out of scope of Elasticsearch (e.g. JVM buffer, file system journaling,
random writes vs. sequential writes, intertwining of reads and writes, fair
queue scheduling of the VM, bus speed, controller configuration, cache
settings etc.) so with normal methods, it can not go much higher.

Your question was about the number of shards for SSD. Really - there is no
different shard limit on an ES cluster whether with HDD or with SSD.

Maybe to wanted to ask about performance gains of SSD, then I misunderstood
the question completely.

Jörg

On Fri, Oct 10, 2014 at 12:34 AM, Kevin Burton burtonator@gmail.com wrote:

On Wednesday, October 8, 2014 12:07:30 AM UTC-7, Jörg Prante wrote:

With ES, you can go up to the bandwidth limit the OS allows for writing
I/O (if you disable throttling etc.)

This means, if you write to one shard, it can be as fast as writing to
thousands of shards in parallel in summary. There is an OS limit for file
system buffers so the more shards, the more RAM is recommended.

If OS restricts file descriptor limits and you want to write to shards,
you can estimate you need a peak of 100-200 file descriptors for active
merges etc. (this varies from version to version and from setting to
setting). ES does not impose a limit here.

These resource demands and custom configurations are not related to the
choice SSD or HDD.

There's no way that's true... for example, if you are on HDD, and you're
index is not in memory, AND you're serving queries (which is a realistic
use case) then there is NO way you can write at the full IO of the disk.
It's just physically impossible.

If ES has been able to solve that problem then they could win a nobel
prize :-p

SSD is 2-3 orders of magnitude faster than HDD ... so yes, it is related
to the choice of SSD or HDD.

... it's entirely possible I'm misinterpreting what you're saying though.

--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/3b40a167-8f82-4f60-9833-8e3bce77370f%40googlegroups.com
https://groups.google.com/d/msgid/elasticsearch/3b40a167-8f82-4f60-9833-8e3bce77370f%40googlegroups.com?utm_medium=email&utm_source=footer
.

For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAKdsXoEK-ECPT-qYgXDDcs5jPZ9W8tSkqEgF9S16TjDXNKBSzw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

nik9000 · October 10, 2014, 4:31pm

On Thu, Oct 9, 2014 at 6:34 PM, Kevin Burton burtonator@gmail.com wrote:

On Wednesday, October 8, 2014 12:07:30 AM UTC-7, Jörg Prante wrote:

With ES, you can go up to the bandwidth limit the OS allows for writing
I/O (if you disable throttling etc.)

This means, if you write to one shard, it can be as fast as writing to
thousands of shards in parallel in summary. There is an OS limit for file
system buffers so the more shards, the more RAM is recommended.

If OS restricts file descriptor limits and you want to write to shards,
you can estimate you need a peak of 100-200 file descriptors for active
merges etc. (this varies from version to version and from setting to
setting). ES does not impose a limit here.

These resource demands and custom configurations are not related to the
choice SSD or HDD.

There's no way that's true... for example, if you are on HDD, and you're
index is not in memory, AND you're serving queries (which is a realistic
use case) then there is NO way you can write at the full IO of the disk.
It's just physically impossible.

If ES has been able to solve that problem then they could win a nobel
prize :-p

SSD is 2-3 orders of magnitude faster than HDD ... so yes, it is related
to the choice of SSD or HDD.

... it's entirely possible I'm misinterpreting what you're saying though.

Yeah. You could disable throttling and write faster. But that'd be
stupid if you need query performance to stay constant.

You asked about the load on my shards. Its not at all evenly distributed.
Update and query rate varies a ton depending on the index.

The are per-index overheads just for having that many indexes but from my
perspective they are pretty small. Some stuff isn't right, for example
Elasticsearch distributes the memory used to buffer writes evenly across
indexes that have received writes in the past few minutes.

Actually serving searches across all the shards is another story though.
For the most part we lay things out so that each search request hits 1/3 -
2 of our servers. We're probably more concerned about that then most
people, though, because we frequently perform actions that have high-ish
per shard overhead like the phrase suggester. Oversubscribing is ok for
indexes that don't get much query traffic but otherwise we make sure
everything is spread out as evenly as possible. We do that by cranking up
the *index *allocation factor and by adding total_shards_per_node = 1 to
our highest traffic indexes. Cranking up the index allocation spreads the
shards of each index out pretty evenly but sometimes Elasticsearch will
smash them together during "exciting" situations like when a node drops
out. total_shards_per_node is a hard limit so even when things get
exciting it'll keep those shards away from each other.

If your wondering about write speed its only kinda related to shards. The
two things you can do to speed up writing is to make sure that all shards
you are writing to are evenly spread out across your nodes and to crank up
the number of segments that are allowed to exist. More segments lower the
write amplification factor both for IO and CPU usage. BUT they cost more
to search. If you plan to do a ton of writes at once then never again you
can crank up the segments during the write and then run an optimize to
squash the segments together. That works great if you never update any of
the documents in the index. Its even ok to index more stuff into the
index. It starts to be annoying if you do lots of updates (I do) because
you get into situations where your big segments have lots of deleted
document in them. Then you have to merge them and that has lots of
overhead. If they get big enough the merge policy will sometimes refuse to
pick them for merging too! Its kind of a degenerate case though.

Looping back around to shards for write performance - the number of
segments allowed is a per shard thing. Its set on the whole index but it
applies per shard. Say the max segments per size tier is 10 and the number
of shards is 5. Then you get 50 segments per tier. So if you double the
number of shards you double the number of segments per tier so writes are
faster. The trouble is there is no command to squash shards together. So
maybe the better solution is to double the number of segments per tier and
then run optimize, because squashing shards together is optimize's job.
OTOH see the caveats above to optimize.

Nik

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAPmjWd2kEhe%2B-YabODeFA-6cLY9BZrn7WC0K7CVuo689RoHMAg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

burtonator · October 10, 2014, 7:23pm

On Friday, October 10, 2014 8:39:43 AM UTC-7, Jörg Prante wrote:

RAM is 1000x faster than SSD.

I mean this is the big caveat isn't it?

If you can fit your whole index in RAM, then great, go for it... but others
have loads that can't it all in RAM at which point SSD becomes a decent
option.

Honestly for many workloads running it ALL out or SSD is an option.

This was what I was getting at...

To me, RAM based search isn't that interesting any more as even on fully
random IO you can saturate a gigabit link on modern SSDs... but if you're
doing crazy queries perhaps RAM is still required.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/46ba386f-0442-4a86-9c0d-bdd9f2fd8af3%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

jprante · October 11, 2014, 7:23am

RAM is cheap and the best method to achieve highest performance in ES.

You can try search benchmarks on ES clusters with > 8 GB RAM and > 4 GB
heap plus mlockall. Either with SSD or HDD you will see only tiny
difference in response times after warmup. If you are only after search
performance, the cost for replacing HDD with SSD is so high and the
performance gain at search side is so small that it is not worth the effort.

If the index grows and exceeds the cluster capacity, you should add RAM or
add nodes to grow the capacity. If you do not, a cluster will get more and
more sluggish and GC overhead increases. SSD does not change that. The
situation when you want constant search response times while you have
growing data in your index does not depend on a question about using SSD or
HDD.

In ES, SSD is best for low latency for index store write access and high
data throughput while bulk indexing or for retrieving large result sets
with scan/scroll. If you are after this, SSD is a must.

Jörg

On Fri, Oct 10, 2014 at 9:23 PM, Kevin Burton burtonator@gmail.com wrote:

On Friday, October 10, 2014 8:39:43 AM UTC-7, Jörg Prante wrote:

RAM is 1000x faster than SSD.

I mean this is the big caveat isn't it?

If you can fit your whole index in RAM, then great, go for it... but
others have loads that can't it all in RAM at which point SSD becomes a
decent option.

Honestly for many workloads running it ALL out or SSD is an option.

This was what I was getting at...

To me, RAM based search isn't that interesting any more as even on fully
random IO you can saturate a gigabit link on modern SSDs... but if you're
doing crazy queries perhaps RAM is still required.

--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/46ba386f-0442-4a86-9c0d-bdd9f2fd8af3%40googlegroups.com
https://groups.google.com/d/msgid/elasticsearch/46ba386f-0442-4a86-9c0d-bdd9f2fd8af3%40googlegroups.com?utm_medium=email&utm_source=footer
.

For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAKdsXoGjQusyMezzvo5e4GYeQ33dJkPyWsZHNusrn2oWjXxX7g%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

burtonator · October 12, 2014, 10:23pm

On Saturday, October 11, 2014 12:23:35 AM UTC-7, Jörg Prante wrote:

RAM is cheap and the best method to achieve highest performance in ES.

You know what's cheaper thank RAM? SSD

And if you have a modern SSD and a reasonable CPU with a moderate amount of
RAM, you can easily saturate a gigabit port.

Kevin

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/55c16758-c0d3-4eae-a3f8-3eb9c3e3aebf%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Topic		Replies	Views
What is the best way to distribute nodes and shards in Elasticsearch to achieve fast search while storing recent data on SSD and older data on HDD? Elasticsearch	27	213	August 17, 2025
Elasticsearch performance in HDD vs SSD and 32 GB vs 64 GB of RAM Elasticsearch	25	4427	June 30, 2023
Random Access & Performance Elasticsearch	12	1690	July 6, 2017
Slow Query Performance Elasticsearch	10	799	July 6, 2017
Figuring out the optimal number of shards Elasticsearch	6	1658	July 6, 2017

How many shards is to many shards per server on SSD?

Related topics