Elasticsearch IO characteristics on SSD Vs HDD

Hi!
We are trying to compare SSD and HDD for our Elasticsearch use case.
In one of the scenarios, we want to see how fast we can open an index.
Index size is around 4.5 - 5GB.
For SSD, we observe that there seem to be more block IO operations being done (with bigger read size).
For uniformity, we merge the index into 1 segment.
We are using https://github.com/brendangregg/perf-tools/blob/master/iosnoop
On SSD, we see a lot of I/O with sizes around 512 KB (516096 bytes):

STARTs          ENDs            COMM         PID    TYPE DEV      BLOCK        BYTES     LATms
535881.595815   535881.605989   java         25538  R    8,16     769839040    516096    10.17
535881.595875   535881.606904   java         25538  R    8,16     769840048    516096    11.03
535881.595937   535881.607795   java         25538  R    8,16     769841056    516096    11.86
535881.595958   535881.608261   java         25538  R    8,16     769842064    516096    12.30
535881.595963   535881.608297   java         25538  R    8,16     769843072    65536     12.33

whereas on HDD, we see relatively fewer I/O operations, with a smaller reported size (131072 bytes):

STARTs          ENDs            COMM         PID    TYPE DEV      BLOCK        BYTES     LATms
535219.823665   535219.824391   java         24474  R    8,48     798720       131072     0.73
535219.824496   535219.824791   java         24474  R    8,48     798976       131072     0.30
535219.824633   535219.826032   java         24474  R    8,48     958464       131072     1.40
535219.824938   535219.827055   java         24474  R    8,48     958720       131072     2.12
535219.826513   535219.827153   java         24474  R    8,48     958976       131072     0.64
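
To compare the two traces more systematically, the iosnoop output can be tallied per device. The following is a minimal sketch rather than part of iosnoop itself: the function name `summarize_iosnoop` and the constants `SSD_TRACE` and `HDD_TRACE` are made up for illustration, and it assumes the nine-column layout shown above with a COMM value that contains no spaces.

```python
from collections import Counter, defaultdict

# The rows below are copied from the two traces in this post.
SSD_TRACE = """\
STARTs          ENDs            COMM         PID    TYPE DEV      BLOCK        BYTES     LATms
535881.595815   535881.605989   java         25538  R    8,16     769839040    516096    10.17
535881.595875   535881.606904   java         25538  R    8,16     769840048    516096    11.03
535881.595937   535881.607795   java         25538  R    8,16     769841056    516096    11.86
535881.595958   535881.608261   java         25538  R    8,16     769842064    516096    12.30
535881.595963   535881.608297   java         25538  R    8,16     769843072    65536     12.33
"""

HDD_TRACE = """\
STARTs          ENDs            COMM         PID    TYPE DEV      BLOCK        BYTES     LATms
535219.823665   535219.824391   java         24474  R    8,48     798720       131072     0.73
535219.824496   535219.824791   java         24474  R    8,48     798976       131072     0.30
535219.824633   535219.826032   java         24474  R    8,48     958464       131072     1.40
535219.824938   535219.827055   java         24474  R    8,48     958720       131072     2.12
535219.826513   535219.827153   java         24474  R    8,48     958976       131072     0.64
"""


def summarize_iosnoop(text):
    """Group iosnoop rows by device; tally request count, bytes, sizes, latency."""
    stats = defaultdict(
        lambda: {"count": 0, "bytes": 0, "lat_ms": 0.0, "sizes": Counter()}
    )
    for line in text.splitlines():
        fields = line.split()
        # Data rows have exactly 9 columns; skip the header and blank lines.
        if len(fields) != 9 or fields[0] == "STARTs":
            continue
        dev, nbytes, lat_ms = fields[5], int(fields[7]), float(fields[8])
        entry = stats[dev]
        entry["count"] += 1
        entry["bytes"] += nbytes
        entry["lat_ms"] += lat_ms  # summed here, averaged when printing
        entry["sizes"][nbytes] += 1
    return dict(stats)


if __name__ == "__main__":
    for dev, entry in summarize_iosnoop(SSD_TRACE + HDD_TRACE).items():
        mean_kib = entry["bytes"] / entry["count"] / 1024
        mean_lat = entry["lat_ms"] / entry["count"]
        print(f"dev {dev}: {entry['count']} reads, "
              f"mean size {mean_kib:.1f} KiB, mean latency {mean_lat:.2f} ms")
        for size, n in sorted(entry["sizes"].items()):
            # 512-byte sectors are assumed when converting bytes to sectors.
            print(f"  {size} bytes ({size // 512} sectors) x {n}")
```

Run against a full capture instead of these five-row excerpts, this gives the request-size distribution per device. Note that 516096 bytes is 1008 sectors of 512 bytes, which matches the spacing between consecutive BLOCK values in the SSD trace, so those reads are contiguous.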

We understand that in real scenarios there will be more segments, so more random I/O will be needed, hence lower performance on HDD.
But here we are curious why the SSD seems to incur bigger I/O sizes during OpenIndex.
Does Elasticsearch have different logic when writing to SSD and HDD?
Or might I be missing some OS-level configuration here?

Thank you in advance!

Elasticsearch itself does not detect whether it's running against an SSD or HDD and alter its behaviour accordingly in any way.

This is likely an OS-level thing, yes. It could be, for example, that the OS is doing bigger physical reads when reading a chunk of a (memory-mapped) file on the SSD. This is really system-specific though, and it's hard for me to guess where the different behaviour is coming from specifically.
It should not be coming from Elasticsearch directly though.
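
One way to look for an OS-level difference on Linux is to compare the block-queue settings the kernel exposes for the two devices under `/sys/block/<device>/queue/`. The sketch below is only a starting point under some assumptions: the helper name `read_queue_settings` is made up, and the device names `sdb` and `sdd` are inferred from the `8,16` and `8,48` major/minor numbers in the traces using the usual SCSI disk numbering, so they should be checked against `lsblk` on the actual machine.

```python
from pathlib import Path

# Standard Linux sysfs queue attributes that can influence request sizes.
QUEUE_FILES = (
    "rotational",         # 1 for spinning disks, 0 for non-rotational devices
    "read_ahead_kb",      # readahead window for the device
    "max_sectors_kb",     # current upper bound on a single request
    "max_hw_sectors_kb",  # hardware upper bound on a single request
    "scheduler",          # available I/O schedulers, active one in brackets
)


def read_queue_settings(device, sysfs_root="/sys/block"):
    """Return the queue attributes for one block device as a dict of strings.

    Attributes that cannot be read are reported as None.
    """
    queue_dir = Path(sysfs_root) / device / "queue"
    settings = {}
    for name in QUEUE_FILES:
        try:
            settings[name] = (queue_dir / name).read_text().strip()
        except OSError:
            settings[name] = None
    return settings


if __name__ == "__main__":
    # Assumed device names; adjust to match the real SSD and HDD.
    for device in ("sdb", "sdd"):
        print(device, read_queue_settings(device))
```

If `read_ahead_kb` or `max_sectors_kb` differ between the two disks, that would be one plausible source of the different read sizes, though other factors (filesystem, mount options, controller) may also play a part.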

