Recommended File System Settings for Elasticsearch

Hey guys,

we are going to expand our development ES system into a production
environment, and we have brand-new storage waiting to be formatted.

To get the best performance out of Elasticsearch, we'd like to know
what the recommended settings are.

Does Elasticsearch read and write sequentially, or is there a lot of
random access on the indices?

Which block size would you recommend? (I've heard Elasticsearch
reads and writes in 4k blocks, right?)

Thanks for any response

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/8181f3db-1ac7-4c32-aece-169687eb538c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

It really depends on your use case, what you store and your queries.

Reads would be random.
Writes will depend on what you're doing, e.g. if you're doing logging then
they will be mostly sequential.

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com


Hi,

thanks for the response.

We are going to build an ELK system, so all ES will do is store the events
and make them searchable for Kibana, so nothing special.


Elasticsearch does not itself read or write in 4k blocks; that is up to the
JVM and the OS. Elasticsearch uses Lucene, and Lucene wraps all reads and
writes in Java I/O streams. Java in turn uses the OS layer, and the OS uses
the file system drivers.

Some rules of thumb: most important, don't mess with the block size.
Because the virtual memory page size on Linux is 4k, and reading/writing
pages between devices and memory should be fast, you will often find a 4k
block size, and buffer sizes that are a multiple of 4k, for random access
workloads. The Linux distributions come with good defaults. There is also
readahead buffering at the fs driver layer for fast reads, but good values
depend strongly on the filesystem and drive type. You can test whether ES
loads Lucene index files much faster if you increase the readahead
buffering of the fs, but I doubt it. Try tools like dd to get an impression
of how fast your file system is.
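
As a rough stand-in for the dd test mentioned above, here is a minimal
Python sketch that writes and then reads back a scratch file in fixed-size
blocks and reports MB/s. The function name, the default sizes and the
scratch directory are assumptions made for illustration; nothing here comes
from Elasticsearch itself, and the read pass will usually be served from
the OS page cache unless the file is larger than RAM.

```python
import os
import tempfile
import time


def sequential_throughput(directory, block_size=4096, total_bytes=64 * 1024 * 1024):
    """Write, then read, a scratch file in block_size chunks.

    Returns (write_mb_per_s, read_mb_per_s). Hypothetical helper for a
    quick impression only; it is not a replacement for dd or fio.
    """
    block = b"\0" * block_size
    blocks = total_bytes // block_size
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb", buffering=0) as f:
            for _ in range(blocks):
                f.write(block)
            os.fsync(f.fileno())  # count the time needed to reach the device
        write_secs = max(time.perf_counter() - start, 1e-9)

        start = time.perf_counter()
        with open(path, "rb", buffering=0) as f:
            while f.read(block_size):  # likely answered from the page cache
                pass
        read_secs = max(time.perf_counter() - start, 1e-9)
    finally:
        os.remove(path)

    megabytes = blocks * block_size / (1024 * 1024)
    return megabytes / write_secs, megabytes / read_secs
```

Calling it once with block_size=4096 and once with a larger value such as
65536, against a directory on the new storage, gives a first comparison
before any file system settings are changed.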

Elasticsearch/Lucene always appends data to the index, like a log device.
There are no in-place updates. Old files are deleted once their contents
have been copied into new files.

Today the main choice is between SSDs and spinning disks. SSDs are
preferred because of their high IOPS and bulk transfer rates, which reduce
the load on the I/O system significantly.

The most important settings are RAID and the noatime option on the fs
mount, which disables writing the file access time. This saves a lot of
unnecessary seeks and writes on spindle drives.
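
To check whether a data volume is actually mounted with noatime, the mount
table can be inspected. The sketch below parses text in the /proc/mounts
format on Linux; the function names and the /data mount point used in the
example are hypothetical.

```python
def mount_options(mounts_text, mount_point):
    """Return the option list for mount_point from /proc/mounts-style text.

    Returns None if the mount point is not listed.
    """
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Line format: device mount_point fs_type options dump pass
        if len(fields) >= 4 and fields[1] == mount_point:
            found = fields[3].split(",")  # a later entry overrides an earlier one
    return found


def has_noatime(mounts_text, mount_point):
    """True if mount_point is listed and carries the noatime option."""
    options = mount_options(mounts_text, mount_point)
    return options is not None and "noatime" in options
```

On a Linux host this could be used as
has_noatime(open("/proc/mounts").read(), "/data"), where /data stands for
wherever the Elasticsearch data path lives.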

The best search performance in Elasticsearch depends on RAM, not on disks.
The more RAM you have, the more files the OS can cache. Setting up mmapfs
and leaving enough fs cache for Elasticsearch is one of the keys to search
performance.
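
One way to see how much memory the OS is currently using for file caching
is to read /proc/meminfo on Linux. The following is a small sketch with
hypothetical helper names; it only parses the text and makes no assumption
about how Elasticsearch itself is configured.

```python
def meminfo_kib(meminfo_text):
    """Parse /proc/meminfo-style text into {field_name: value}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are
    plain counts and are returned as-is.
    """
    values = {}
    for line in meminfo_text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def cache_share(meminfo_text):
    """Fraction of total RAM currently held by the page cache."""
    info = meminfo_kib(meminfo_text)
    return info["Cached"] / info["MemTotal"]
```

On a Linux host, cache_share(open("/proc/meminfo").read()) gives a quick
reading of how much of the RAM is serving as file system cache at that
moment.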

Jörg
