Any experience with ES and Data Compressing Filesystems?


(Horst Birne) #1

Hey Guys,

To save a lot of disk space, we are planning to use a compressing file
system, which would give us transparent compression for the ES indices.
(ES indices seem to compress very well; we got up to a 65% compression
rate in some tests.)

Currently the indices reside on an ext4 Linux file system, which
unfortunately does not offer transparent compression.

Does anyone have experience with compressing file systems such as Btrfs or
ZFS/OpenZFS, and can you tell us whether this led to big performance losses?

Thanks for responding

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/6c6c806a-f638-4139-a080-3da7670f0eca%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Mark Walkom) #2

There are a few previous threads on this topic in the archives, though I
don't immediately recall seeing any performance metrics, unfortunately.

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com

On 16 July 2014 20:56, horst knete baduncle23@hotmail.de wrote:


(Jörg Prante) #3

You will not gain much because ES already compresses data on disk with LZF,
and ZFS uses LZ4, whose compression output is quite similar. The file system
statistics will show you the compression ratio, and it will not be a good
value. So instead of having ZFS try to compress where not much can be
gained, you should switch compression off.
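
A minimal sketch of why a second compression layer gains little, assuming zlib from Python's standard library as a stand-in for LZF/LZ4 (neither ships with Python) and synthetic log-like lines in place of real index data. `make_log_lines` is a hypothetical helper, not anything from ES or ZFS:

```python
import random
import zlib


def make_log_lines(n_lines: int, seed: int = 0) -> bytes:
    """Build synthetic, log-like text (hypothetical data, not real ES content)."""
    rng = random.Random(seed)
    lines = [
        "2014-07-16 12:56:%02d host-%02d GET /app/page/%d 200 %d\n"
        % (rng.randrange(60), rng.randrange(20), rng.randrange(1000), rng.randrange(100000))
        for _ in range(n_lines)
    ]
    return "".join(lines).encode("ascii")


raw = make_log_lines(20000)
once = zlib.compress(raw)    # stand-in for the compression the index already applies
twice = zlib.compress(once)  # stand-in for the file system compressing it again

first_ratio = len(raw) / len(once)
second_ratio = len(once) / len(twice)
print(f"first pass:  {first_ratio:.2f}x")
print(f"second pass: {second_ratio:.2f}x")
```

The second pass has almost nothing left to remove, which is the effect to look for in the file system's compression statistics.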

Jörg


(Jörg Prante) #4

Oops, that is not true: Elasticsearch uses Lucene codec compression, which
is also LZ4 (LZF is kept only for backwards compatibility).

Here are some numbers:

Jörg

On Wed, Jul 16, 2014 at 2:28 PM, joergprante@gmail.com wrote:


(Otis Gospodnetić) #5

Hi Horst,

I wouldn't bother with this for the reasons Joerg mentioned, but should you
try it anyway, I'd love to hear your findings/observations.

Otis

Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


(Horst Birne) #6

Hey guys,

We mounted a Btrfs file system with the "zlib" compression method for
testing purposes on our Elasticsearch server and copied one of the indices
onto the Btrfs volume. Unfortunately it had no effect: the index still has
a size of 50 GB :/

I will try other compression methods next and report back here.

On Saturday, 19 July 2014 07:21:20 UTC+2, Otis Gospodnetic wrote:


(Patrick Proniewski) #7

Hi,

gzip/zlib compression is very bad for performance, so it can be interesting for closed indices, but I would not recommend it for live data.
Also, you should know that:

LZ4 compression is already enabled inside the indices,
ES/Lucene/Java usually reads and writes 4k blocks,

-> hence, compression is applied to 4k blocks. If your file system uses 4k blocks and you add FS compression, you will probably see a very small gain, if any. I've tried it on ZFS:

Filesystem     Size  Used  Avail  Capacity  Mounted on
zdata/ES-lz4   1.1T  1.9G   1.1T        0%  /zdata/ES-lz4
zdata/ES       1.1T  1.9G   1.1T        0%  /zdata/ES

If you are using a larger block size, like 128k, a compressed filesystem does show some benefit:

Filesystem     Size  Used  Avail  Capacity  Mounted on
zdata/ES-lz4   1.1T  1.1G   1.1T        0%  /zdata/ES-lz4   -> compressratio 1.73x
zdata/ES-gzip  1.1T  901M   1.1T        0%  /zdata/ES-gzip  -> compressratio 2.27x
zdata/ES       1.1T  1.9G   1.1T        0%  /zdata/ES
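
The block-size effect can be sketched in isolation, assuming zlib as a stand-in for the file system's compressor and synthetic log-like text as input (hypothetical data; these are not the ZFS measurements above, and index files that are already LZ4-compressed will give different absolute numbers). Each block is compressed independently, as a block-based file system would do:

```python
import random
import zlib


def make_log_lines(n_lines: int, seed: int = 0) -> bytes:
    """Build synthetic, log-like text (hypothetical data, not real ES content)."""
    rng = random.Random(seed)
    lines = [
        "2014-07-21 08:48:%02d host-%02d GET /app/page/%d 200 %d\n"
        % (rng.randrange(60), rng.randrange(20), rng.randrange(1000), rng.randrange(100000))
        for _ in range(n_lines)
    ]
    return "".join(lines).encode("ascii")


def compressed_size(data: bytes, block_size: int) -> int:
    """Total size when every block of block_size bytes is compressed on its own."""
    return sum(
        len(zlib.compress(data[start:start + block_size]))
        for start in range(0, len(data), block_size)
    )


raw = make_log_lines(20000)
size_4k = compressed_size(raw, 4 * 1024)
size_128k = compressed_size(raw, 128 * 1024)
print(f"4k blocks:   {len(raw) / size_4k:.2f}x")
print(f"128k blocks: {len(raw) / size_128k:.2f}x")
```

Larger blocks give the compressor more context per block and fewer per-block headers, so the 128k run ends up smaller than the 4k run.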

But a file system block larger than 4k is very suboptimal for I/O (ES reads or writes one 4k block -> your FS must read or write a 128k block).
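
The cost of that mismatch is simple arithmetic. A small sketch, assuming aligned requests and a file system that always reads or writes whole records (`read_amplification` is a hypothetical helper name):

```python
import math


def read_amplification(request_bytes: int, record_bytes: int) -> float:
    """Bytes the file system must move per byte the application asked for."""
    # A request always touches at least one whole record.
    records_touched = math.ceil(request_bytes / record_bytes)
    return records_touched * record_bytes / request_bytes


for record_kib in (4, 32, 128):
    factor = read_amplification(4 * 1024, record_kib * 1024)
    print(f"4k request on {record_kib}k records: {factor:.0f}x")
```

With a 4k request and a 128k record the function returns 32.0, i.e. 32 bytes moved for every byte requested; with a 4k record it returns 1.0.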

On 21 Jul 2014, at 07:58, horst knete baduncle23@hotmail.de wrote:


(Horst Birne) #8

Hi again,

a quick report regarding compression:

We are now using a 3 TB Btrfs volume with a 32k block size, which reduced
the amount of data from 3.2 TB to 1.1 TB without any significant performance
loss (we are using a machine with 8 CPUs and 20 GB of memory, with an iSCSI
link to the volume).

So, for us, I can only recommend using a Btrfs volume for long-term storage.
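
For reference, the figures above work out to roughly a 2.9x compression ratio, or about 66% of the space saved. A short sketch of that arithmetic (`compression_summary` is a hypothetical helper; the two sizes are the ones reported in this post):

```python
from typing import Tuple


def compression_summary(before_tb: float, after_tb: float) -> Tuple[float, float]:
    """Return (compression ratio, percent of space saved)."""
    ratio = before_tb / after_tb
    saved_percent = (1 - after_tb / before_tb) * 100
    return ratio, saved_percent


ratio, saved = compression_summary(3.2, 1.1)
print(f"ratio: {ratio:.2f}x, saved: {saved:.1f}%")
```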

On Monday, 21 July 2014 08:48:12 UTC+2, Patrick Proniewski wrote:


(Mark Walkom) #9

What sort of data are you indexing? When you said performance impact was
minimal, how minimal and at what points are you seeing it?

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com

On 4 August 2014 16:43, horst knete baduncle23@hotmail.de wrote:


(Horst Birne) #10

We are indexing all sorts of events (Windows, Linux, Apache, NetFlow and so
on), and we measure impact by the speed of the Kibana GUI, i.e. how long it
takes to load 7 or 14 days of data. That's what matters to my colleagues.
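
One way to put a number on "how long it takes" is to time the same request several times on each volume and compare medians. A minimal sketch, assuming a hypothetical `placeholder_query` that stands in for whatever request loads the 7 or 14 days of data (it is not an ES or Kibana call):

```python
import statistics
import time
from typing import Callable, List


def time_runs(workload: Callable[[], object], runs: int = 5) -> List[float]:
    """Run the workload several times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        workload()
        durations.append(time.perf_counter() - start)
    return durations


def placeholder_query() -> int:
    """Hypothetical stand-in for the real request; replace with an actual query."""
    return sum(range(100_000))


durations = time_runs(placeholder_query, runs=5)
print(f"median: {statistics.median(durations):.4f}s, worst: {max(durations):.4f}s")
```

Repeated runs of the same query are likely to be served from memory after the first one, so the first and later durations are worth looking at separately.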

On Monday, 4 August 2014 10:52:25 UTC+2, Mark Walkom wrote:


(Jörg Prante) #11

Are you aware that the kind of search performance you mean depends on the
RAM and virtual memory organization of the cluster, not on storage, so
"without any significant performance losses" is what one would expect?

Jörg

On 4 August 2014, 12:41, horst knete wrote:


(Horst Birne) #12

Hi,

I wasn't really aware that this could only lead to higher CPU and RAM
usage; that said, the CPU load has indeed increased by about 20-30%
compared to not compressing the storage. RAM usage did not increase by
much.

IMHO a somewhat higher CPU load is definitely worth it if you can save about
60% of your disk space; every financial manager would agree.

On Monday, 4 August 2014 14:06:47 UTC+2, Jörg Prante wrote:

You are aware of the fact that the kind of search performance you mean
depends on RAM and the virtual memory organization of the cluster, not on
storage, so "without any significant performance losses" could be
expected?

Jörg

On 04.08.14 12:41, horst knete wrote:

We are indexing all sorts of events (Windows, Linux, Apache, Netflow
and so on...) and impact is defined as the speed of the Kibana GUI / how
long it takes to load 7 or 14 days of data. That's what is important
for my colleagues.

On Monday, 4 August 2014 10:52:25 UTC+2, Mark Walkom wrote:

What sort of data are you indexing? When you said performance
impact was minimal, how minimal and at what points are you seeing
it?

Regards, 
Mark Walkom 

Infrastructure Engineer 
Campaign Monitor 
email: ma...@campaignmonitor.com 
web: www.campaignmonitor.com <http://www.campaignmonitor.com> 


On 4 August 2014 16:43, horst knete <badun...@hotmail.de> wrote: 

    Hi again, 

    a quick report regarding compression: 

    we are now using a 3 TB btrfs volume with a 32k block size, which
    reduced the amount of data from 3.2 TB to 1.1 TB without any
    significant performance losses (we are using a machine with 8 CPUs
    and 20 GB of memory, with an iSCSI link to the volume).

    So for us, I can only suggest using a btrfs volume for long-term
    storage.

    On Monday, 21 July 2014 08:48:12 UTC+2, Patrick Proniewski wrote:

        Hi, 

        gzip/zlib compression is very bad for performance, so it 
        can be interesting for closed indices, but for live data I 
        would not recommend it. 
        Also, you should know that:

        Compression using lz4 is already enabled in the indices, and
        ES/Lucene/Java usually read and write 4k blocks,

        -> hence, compression is achieved on 4k blocks. If your
        filesystem uses 4k blocks and you add FS compression, you
        will probably see a very small gain, if any. I've tried
        on ZFS:

        Filesystem      Size  Used  Avail  Capacity  Mounted on
        zdata/ES-lz4    1.1T  1.9G  1.1T   0%        /zdata/ES-lz4
        zdata/ES        1.1T  1.9G  1.1T   0%        /zdata/ES

        If you are using a larger block size, like 128k, a 
        compressed filesystem does show some benefit: 

        Filesystem      Size  Used  Avail  Capacity  Mounted on
        zdata/ES-lz4    1.1T  1.1G  1.1T   0%        /zdata/ES-lz4   -> compressratio 1.73x
        zdata/ES-gzip   1.1T  901M  1.1T   0%        /zdata/ES-gzip  -> compressratio 2.27x
        zdata/ES        1.1T  1.9G  1.1T   0%        /zdata/ES

        But a file system block larger than 4k is very suboptimal
        for IO (ES reads or writes one 4k block -> your FS must read
        or write a 128k block).
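To put a number on the block-size trade-off described above, here is a rough Python sketch. It assumes block-aligned IO and that the filesystem always reads or writes whole blocks, ignoring caching and readahead, so treat the result as a worst-case estimate. The function name is hypothetical.

```python
def io_amplification(app_io_bytes: int, fs_block_bytes: int) -> float:
    """Bytes the filesystem touches per byte the application requested.

    Assumes aligned IO and whole-block reads/writes (no cache, no readahead).
    """
    # Ceiling division: number of filesystem blocks covering the request.
    blocks_touched = -(-app_io_bytes // fs_block_bytes)
    return blocks_touched * fs_block_bytes / app_io_bytes


if __name__ == "__main__":
    KIB = 1024
    # 4k application IO against the block sizes mentioned in this thread.
    for fs_block_kib in (4, 32, 128):
        factor = io_amplification(4 * KIB, fs_block_kib * KIB)
        print(f"{fs_block_kib}k blocks: {factor:.0f}x")
```

Under these assumptions a 4k request costs 8x the IO on 32k blocks and 32x on 128k blocks, which is the price paid for the better compression ratio.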

        On 21 Jul 2014, at 07:58, horst knete
        <badun...@hotmail.de> wrote:

        > Hey guys,
        >
        > we have mounted a btrfs file system with the compression method
        > "zlib" for testing purposes on our Elasticsearch server and copied
        > one of the indices onto the btrfs volume. Unfortunately it had no
        > success and the index still has a size of 50 GB :/
        >
        > I will try it with other compression methods and report back here.
        >
        > On Saturday, 19 July 2014 07:21:20 UTC+2, Otis Gospodnetic wrote:
        >>
        >> Hi Horst,
        >>
        >> I wouldn't bother with this for the reasons Joerg mentioned, but
        >> should you try it anyway, I'd love to hear your
        >> findings/observations.
        >>
        >> Otis
        >> --
        >> Performance Monitoring * Log Analytics * Search Analytics
        >> Solr & Elasticsearch Support * http://sematext.com/
        >>
        >>
        >>
        >> On Wednesday, July 16, 2014 6:56:36 AM UTC-4, horst knete wrote:
        >>>
        >>> Hey Guys,
        >>>
        >>> to save a lot of hard disk space, we are going to use a
        >>> compression file system, which allows us transparent compression
        >>> for the es-indices. (It seems like es-indices compress very well,
        >>> we got up to a 65% compression rate in some tests).
        >>>
        >>> Currently the indices are lying on an ext4 Linux filesystem, which
        >>> unfortunately doesn't have the transparent compression ability.
        >>>
        >>> Does anyone of you have experience with compression file systems
        >>> like BTRFS or ZFS/OpenZFS and can tell us if this led to big
        >>> performance losses?
        >>>
        >>> Thanks for responding



(system) #13