Elasticsearch Compression ratio

hossein_ey · July 17, 2017, 3:48pm

We want to use Elasticsearch for a large amount of data.
One of the important issues is storage usage.

We created a sample index with 412 million rows.

412 million rows take 242 GB of disk ===> about 590 bytes per row.

We know that each row of our data in JSON format is 800-1000 bytes.

So Elasticsearch compressed our roughly 900-byte rows into about 590 bytes ...

Is there any better way to compress our data?
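For reference, the arithmetic in the post can be reproduced with a short sketch. It assumes decimal gigabytes (10^9 bytes) and takes 900 bytes as the midpoint of the 800-1000 byte estimate; the function names are made up for illustration. With binary gigabytes (2^30 bytes) the per-row figure would come out closer to 630 bytes. Note also that the on-disk size covers the whole index (inverted index, doc values, the _all field, stored source), so this is an overall size ratio rather than a pure compression ratio of the JSON.

```python
# Figures quoted in the post above.
DOC_COUNT = 412_000_000
INDEX_SIZE_GB = 242
RAW_DOC_BYTES = 900  # assumed midpoint of the 800-1000 byte estimate


def bytes_per_doc(index_size_gb, doc_count, bytes_per_gb=10**9):
    """Average on-disk bytes per document (decimal GB by default)."""
    return index_size_gb * bytes_per_gb / doc_count


def size_ratio(on_disk_bytes, raw_bytes):
    """On-disk size expressed as a fraction of the raw JSON size."""
    return on_disk_bytes / raw_bytes


per_doc = bytes_per_doc(INDEX_SIZE_GB, DOC_COUNT)
ratio = size_ratio(per_doc, RAW_DOC_BYTES)

print(f"{per_doc:.0f} bytes per document")   # 587 bytes per document
print(f"{ratio:.0%} of the raw JSON size")   # 65% of the raw JSON size
```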

Christian_Dahlqvist · July 17, 2017, 3:53pm

There are some guidelines here. What does your mapping look like?

hossein_ey · July 17, 2017, 4:22pm

Most fields are integers.
We have 50-60 fields per document, and most of them should be searchable (exact search and range search).
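Since most of the fields are integers, one of the general disk-usage guidelines may apply here: map each field to the smallest integer type that can hold its values. The sketch below picks a type from a known value range. The field names and their ranges are hypothetical, and how much space this actually saves depends on the data.

```python
# Inclusive value ranges of Elasticsearch's integer field types.
INTEGER_TYPES = [
    ("byte", -2**7, 2**7 - 1),
    ("short", -2**15, 2**15 - 1),
    ("integer", -2**31, 2**31 - 1),
    ("long", -2**63, 2**63 - 1),
]


def smallest_integer_type(min_value, max_value):
    """Return the narrowest integer type whose range covers [min_value, max_value]."""
    for name, low, high in INTEGER_TYPES:
        if low <= min_value and max_value <= high:
            return name
    raise ValueError("range does not fit in a 64-bit long")


# Hypothetical fields with assumed (min, max) value ranges.
field_ranges = {
    "flag": (0, 1),
    "status_code": (0, 599),
    "user_id": (0, 5_000_000_000),
}

# Build the "properties" part of a mapping from those ranges.
properties = {
    field: {"type": smallest_integer_type(low, high)}
    for field, (low, high) in field_ranges.items()
}
print(properties)
```

All of these types support both exact-match and range queries, so narrowing a type does not change what can be searched.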

Christian_Dahlqvist · July 17, 2017, 4:31pm

Are you using the best_compression codec? Do you have the _all field enabled? If so, do you need it?

hossein_ey · July 18, 2017, 6:49am

We use the default settings of the Logstash template and Elasticsearch ...
The compression is the default (LZ4, I think) and the _all field is enabled.
We don't need the _all field, so we should disable it ...
But as for best_compression, does it impact indexing performance?

Christian_Dahlqvist · July 18, 2017, 6:59am

Using best_compression does have some impact on indexing performance, but it compresses the source a lot better and can save a significant amount of disk space. Disabling the _all field will also save space if you do not need it. If you have fields that do not need to be aggregated upon or be subject to free-text search, you can also slim down the default Logstash mappings and not have all fields dual-mapped. I discussed these topics and the trade-offs in a blog post.
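The three suggestions above could be combined into one index-creation body, roughly as sketched below. This assumes a 5.x-era cluster, as in this 2017 thread; later versions dropped the _all field and mapping types, so the body would need adjusting there. The type name "logs" and the template name "strings_as_keyword" are placeholders. index.codec is a static setting, so it has to be set when the index is created (or while it is closed). The sketch only builds and prints the JSON and does not talk to a cluster.

```python
import json


def build_index_body(type_name="logs"):
    """Build a create-index body aimed at lower disk usage (5.x-style, assumed)."""
    return {
        "settings": {
            # Trade some indexing speed for better stored-field compression.
            "index": {"codec": "best_compression"},
        },
        "mappings": {
            type_name: {
                # Skip the catch-all field if it is never searched.
                "_all": {"enabled": False},
                # Map new string fields as keyword only instead of the
                # text + keyword dual mapping. Only suitable for fields
                # that do not need free-text search.
                "dynamic_templates": [
                    {
                        "strings_as_keyword": {
                            "match_mapping_type": "string",
                            "mapping": {"type": "keyword"},
                        }
                    }
                ],
            }
        },
    }


body = build_index_body()
print(json.dumps(body, indent=2))
```

The same settings and mappings could instead go into the index template that Logstash installs, so that every new daily index picks them up.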

