Compression in ES 1.2.1


(sri) #1

Hello everyone,

I have read posts and blogs on how Elasticsearch compression can be enabled
in previous versions (0.17 - 0.19).

I am currently using ES 1.2.1, and I wasn't able to find out how to enable
compression in this version, or whether there is any such option at all.

I know that I can reduce the amount of storage by disabling the source using
the mapping API
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-source-field.html,
but what I am interested in is compression of the stored data.

Thanks and Regards
Sri

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/209b1832-6924-4794-833e-489917962211%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(David Pilato) #2

It's compressed by default now.

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs



(sri) #3

Hi David,

Thank you very much for the prompt reply.

Below are the stats that I got when I was testing the ES cluster:

Number of nodes: 2
Input format: rsyslog

Input file size (MB)   ES file size per node (MB)
1                      1.8
2                      3.6
3                      5.3
4                      6.8
5                      8.5
6                      10.1
7                      11.7
8                      13
9                      14.1
10                     16

I am sorry to ask like this, but I don't understand how the compression
is taking place.
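
For reference, the expansion factor implied by these figures can be worked out with a short awk helper. This is only a sketch over a few rows of the table above; the function name `ratio` is made up for illustration and is not an Elasticsearch tool.

```shell
#!/bin/sh
# ratio INPUT_MB ES_MB
# Print the ES on-disk size per node divided by the input size.
# "ratio" is a hypothetical helper name used only in this example.
ratio() {
  awk -v in_mb="$1" -v es_mb="$2" 'BEGIN { printf "%.2f\n", es_mb / in_mb }'
}

# A few (input MB, ES MB per node) pairs copied from the table above.
for pair in "1 1.8" "5 8.5" "10 16"; do
  set -- $pair   # split the pair into $1 (input) and $2 (ES size)
  echo "input $1 MB -> $(ratio "$1" "$2")x on disk per node"
done
```

Across the whole table the factor stays between roughly 1.6 and 1.8, so the on-disk size grows more or less linearly with the input.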

Thanks and Regards
Sri



(David Pilato) #4

Well, consider that you are indexing every field individually, that you are storing the source (compressed), and that you are indexing the _all field as well.

So with the defaults, these results make sense to me.

Try disabling the _all field and see what gain you get.
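
A minimal sketch of that suggestion, assuming a node on localhost:9200 and a fresh test index: the index name `logs-noall`, the type name `syslog`, and the helper function names are placeholders, not anything from the original setup. The `_all` setting is given when the index is created.

```shell
#!/bin/sh
# Assumed node address; override with ES_HOST=host:port.
ES_HOST="${ES_HOST:-localhost:9200}"

# Print an index-creation body that turns off _all for one mapping type.
# $1 is the type name (a placeholder such as "syslog").
no_all_body() {
  printf '{ "mappings": { "%s": { "_all": { "enabled": false } } } }\n' "$1"
}

# Create index $1 with mapping type $2.
# Needs a reachable node, so it is only defined here, not called.
create_index_without_all() {
  curl -XPUT "$ES_HOST/$1" -d "$(no_all_body "$2")"
}

# Example: create_index_without_all logs-noall syslog
```

Re-running the same input against such an index and comparing the store size with the earlier numbers shows how much of the overhead comes from `_all`.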

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs



(Jörg Prante) #5

Compression is always enabled by default.

Jörg



(sri) #6

Okay, I will make the changes and upload the new stats.

I am just curious: could you explain why the results make sense? I
just want to get a proper idea of what ES is actually doing to the data.

Thanks and Regards
Sri



(Jörg Prante) #7

The Elasticsearch file size does not only contain compressed fields, but
much more: for example, term vectors, norms, etc. You would have to disable
the field attributes you do not want. Also note that Elasticsearch has
replicas enabled by default, and the segment count is not optimized automatically.
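
For a size test, the last two points can be controlled explicitly. The sketch below assumes a node on localhost:9200 and uses the update-settings and `_optimize` endpoints of the 1.x REST API; the helper function names and the index name in the example comment are placeholders.

```shell
#!/bin/sh
# Assumed node address; override with ES_HOST=host:port.
ES_HOST="${ES_HOST:-localhost:9200}"

# Print a settings body for the given replica count.
replicas_body() {
  printf '{ "index": { "number_of_replicas": %d } }\n' "$1"
}

# Set the replica count of index $1 to $2 (can be changed on a live index).
set_replicas() {
  curl -XPUT "$ES_HOST/$1/_settings" -d "$(replicas_body "$2")"
}

# Merge index $1 down to a single segment. This is I/O heavy,
# so it is best kept for test indices or indices no longer written to.
optimize_index() {
  curl -XPOST "$ES_HOST/$1/_optimize?max_num_segments=1"
}

# Example (test index only):
#   set_replicas logs-test 0
#   optimize_index logs-test
```

With one replica (the default) the cluster holds two copies of every shard, so dropping replicas to 0 on a test index makes the on-disk size easier to compare with the input size.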

Jörg



(Patrick Proniewski) #8

Hello,

I don't know how it's compressed, but it appears that data is compressed up to an amount of 4k, i.e. it's useless to store data on a compressed (lz4) filesystem if the fs block size is 4k:

Filesystem      Size   Used   Avail   Capacity   Mounted on
zdata/ES-lz4    1.1T   1.9G   1.1T    0%         /zdata/ES-lz4
zdata/ES        1.1T   1.9G   1.1T    0%         /zdata/ES

But if the fs block size is greater (say 128k), filesystem compression is a huge win:

Filesystem      Size   Used   Avail   Capacity   Mounted on
zdata/ES-lz4    1.1T   1.1G   1.1T    0%         /zdata/ES-lz4    -> compressratio 1.73x
zdata/ES-gzip   1.1T   901M   1.1T    0%         /zdata/ES-gzip   -> compressratio 2.27x
zdata/ES        1.1T   1.9G   1.1T    0%         /zdata/ES

Unfortunately, a filesystem block size greater than 4K is not optimal for I/O (unless you have a large amount of physical memory you can dedicate to the filesystem data cache, which would be redundant with the ES cache).
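
A comparison like this could be set up roughly as follows. This sketch assumes the datasets are ZFS (the `compressratio` figures suggest so) and that "fs block size" corresponds to the ZFS `recordsize` property; the helper function names are made up, and the dataset names are the ones shown in the listings above.

```shell
#!/bin/sh
# make_test_dataset DATASET COMPRESSION RECORDSIZE
# Create a ZFS dataset with the given compression algorithm and record size.
# Needs a ZFS pool and sufficient privileges, so it is only defined here.
make_test_dataset() {
  zfs create -o compression="$2" -o recordsize="$3" "$1"
}

# show_ratio DATASET
# Print only the compression ratio ZFS reports for the dataset.
show_ratio() {
  zfs get -H -o value compressratio "$1"
}

# Example:
#   make_test_dataset zdata/ES-lz4  lz4  128K
#   make_test_dataset zdata/ES-gzip gzip 128K
#   show_ratio zdata/ES-lz4
```

Copying the same ES data directory into each dataset and reading `compressratio` afterwards gives figures comparable to the ones quoted above.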



(Jörg Prante) #9

Lucene uses LZ4 compression, so you should not run ES on a ZFS file system with compression enabled.

Jörg



(sri) #10

Thanks a lot for the insight Patrick.

I have a few more questions:

  • Is it possible to disable the '_source' and '_all' fields by default
    for all indices that will be created later (possibly defined in the
    elasticsearch.yml file)?
  • What happens if my index has already been created and I then disable the
    '_source' and '_all' fields? Would that affect the file size of the index,
    or will the fields be removed/disabled only for documents that are added
    after disabling the fields?

Thanks and Regards
Sri



(Jörg Prante) #11

Try this index template for new index creations:

curl -XPUT 'localhost:9200/_template/template1' -d '
{
  "template" : "*",
  "mappings" : {
    "_default_" : {
      "_source" : { "enabled" : false },
      "_all" : { "enabled" : false }
    }
  }
}
'

See also

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-templates.html

You cannot disable _all or _source in an existing index.
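
To check that a template is actually taking effect, something like the following may help. It is a sketch that assumes a node on localhost:9200; the helper function names and the index name in the example comment are placeholders, while `template1` is the template name used in the curl command above.

```shell
#!/bin/sh
# Assumed node address; override with ES_HOST=host:port.
ES_HOST="${ES_HOST:-localhost:9200}"

# URL builders (plain string formatting, no network access).
template_url() { printf '%s/_template/%s?pretty\n' "$ES_HOST" "$1"; }
mapping_url()  { printf '%s/%s/_mapping?pretty\n' "$ES_HOST" "$1"; }

# Show the stored template $1.
show_template() { curl -XGET "$(template_url "$1")"; }

# Show the mapping of index $1; for an index created after the template
# was stored, the disabled fields should appear in its mapping.
show_mapping() { curl -XGET "$(mapping_url "$1")"; }

# List all indices with their on-disk store size.
list_index_sizes() { curl -XGET "$ES_HOST/_cat/indices?v"; }

# Example:
#   show_template template1
#   show_mapping logs-new
#   list_index_sizes
```

Templates are applied only when an index is created, so an index that existed before the template was stored keeps its old mapping and size until the data is indexed again into a newly created index.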

Jörg



(sri) #12

Hello Jorg,

Thanks a lot for the info. I tried applying the template you provided,
but the size is not decreasing. On the other hand, I did notice a decrease in
size when I disabled the fields via the Mapping API.

Thanks and Regards
Sri



(sri) #13

Hello Jorg,

I am sorry, there was a problem in the implementation at my end. Thanks
a lot, guys, for the insight and help.
I appreciate the quick responses.

Thanks and Regards
Sri


