Does Elasticsearch automatically compress data?

Two days ago I imported 144 MB of logs to see how much disk space they need in Elasticsearch.
The result was:

/var/lib/elasticsearch# du -ch nodes/ 
...
974M    nodes/0
974M    nodes/
974M    total

Today I checked that directory again, and it is now much smaller:

/var/lib/elasticsearch# du -ch nodes/
...
316M    nodes/0
316M    nodes/
316M    total

The biggest differences, from a diff of the two du outputs:

< 129M    nodes/0/indices/nEHLUvAWQhi6IMQtqiTM9A/3/translog
< 190M    nodes/0/indices/nEHLUvAWQhi6IMQtqiTM9A/3
---
> 8.0K    nodes/0/indices/nEHLUvAWQhi6IMQtqiTM9A/3/translog
> 62M     nodes/0/indices/nEHLUvAWQhi6IMQtqiTM9A/3
108,110c108,110
< 136M    nodes/0/indices/nEHLUvAWQhi6IMQtqiTM9A/1/translog
< 197M    nodes/0/indices/nEHLUvAWQhi6IMQtqiTM9A/1
< 61M     nodes/0/indices/nEHLUvAWQhi6IMQtqiTM9A/2/index
---
> 8.0K    nodes/0/indices/nEHLUvAWQhi6IMQtqiTM9A/1/translog
> 63M     nodes/0/indices/nEHLUvAWQhi6IMQtqiTM9A/1
> 62M     nodes/0/indices/nEHLUvAWQhi6IMQtqiTM9A/2/index
112,114c112,114
< 136M    nodes/0/indices/nEHLUvAWQhi6IMQtqiTM9A/2/translog
< 197M    nodes/0/indices/nEHLUvAWQhi6IMQtqiTM9A/2
< 62M     nodes/0/indices/nEHLUvAWQhi6IMQtqiTM9A/0/index
---
> 8.0K    nodes/0/indices/nEHLUvAWQhi6IMQtqiTM9A/2/translog
> 62M     nodes/0/indices/nEHLUvAWQhi6IMQtqiTM9A/2
> 63M     nodes/0/indices/nEHLUvAWQhi6IMQtqiTM9A/0/index
116,117c116,117
< 129M    nodes/0/indices/nEHLUvAWQhi6IMQtqiTM9A/0/translog
< 190M    nodes/0/indices/nEHLUvAWQhi6IMQtqiTM9A/0
---
> 8.0K    nodes/0/indices/nEHLUvAWQhi6IMQtqiTM9A/0/translog
> 63M     nodes/0/indices/nEHLUvAWQhi6IMQtqiTM9A/0
119c119
< 62M     nodes/0/indices/nEHLUvAWQhi6IMQtqiTM9A/4/index
---
> 63M     nodes/0/indices/nEHLUvAWQhi6IMQtqiTM9A/4/index
121,127c121,127
< 136M    nodes/0/indices/nEHLUvAWQhi6IMQtqiTM9A/4/translog
< 197M    nodes/0/indices/nEHLUvAWQhi6IMQtqiTM9A/4
< 970M    nodes/0/indices/nEHLUvAWQhi6IMQtqiTM9A
< 974M    nodes/0/indices
< 974M    nodes/0
< 974M    nodes/
< 974M    total
\ No newline at end of file
---
> 8.0K    nodes/0/indices/nEHLUvAWQhi6IMQtqiTM9A/4/translog
> 63M     nodes/0/indices/nEHLUvAWQhi6IMQtqiTM9A/4
> 310M    nodes/0/indices/nEHLUvAWQhi6IMQtqiTM9A
> 316M    nodes/0/indices
> 316M    nodes/0
> 316M    nodes/
> 316M    total
\ No newline at end of file

Can someone please explain what is happening?
I'd like some exact numbers so that I can size the production server.

I'm using Elasticsearch 5.3.

Does Elasticsearch automatically compress data? Yes. Stored data is compressed with LZ4 by default. You can switch to DEFLATE by setting index.codec to best_compression, which gives a higher compression ratio at the cost of more CPU and a slightly slower indexing rate.
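
If you want to try the DEFLATE codec on a test index, here is a minimal sketch of the create-index request. It only builds the request and does not send anything. The index name logs-test and the helper create_index_request are made up for illustration. As far as I know, index.codec is a static setting, so it has to be set when the index is created (or while the index is closed).

```python
import json

# Codec values accepted by the index.codec setting:
# "default" (LZ4) and "best_compression" (DEFLATE).
VALID_CODECS = ("default", "best_compression")


def create_index_request(index_name, codec="best_compression"):
    """Return (method, path, body) for a create-index call with the given codec.

    Hypothetical helper: it builds the request pieces only, so you can send
    them with curl or any HTTP client you already use.
    """
    if codec not in VALID_CODECS:
        raise ValueError("unknown codec: " + codec)
    body = {"settings": {"index": {"codec": codec}}}
    return "PUT", "/" + index_name, json.dumps(body)


# "logs-test" is an assumed index name, not one from this thread.
method, path, body = create_index_request("logs-test")
print(method, path)
print(body)
```

Indexing the same 144 MB into an index created this way and comparing the du output again would show how much the stronger codec actually saves for your logs.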

With regard to space, the first big thing I notice is that the translogs have cleared themselves out, just as they are supposed to. The translog (transaction log) keeps a running record of the operations being indexed. Entries are purged from the translog once they have been verified as safely written to the index. This process is transparent to the user and recurs for as long as new data is being added to Elasticsearch.
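
To check that the translog really accounts for the drop, you can add up the translog sizes from the diff in the question. This is just a quick sketch: the sizes are copied from the du output above, and to_bytes is a small helper I made up to read du -h style values.

```python
def to_bytes(size):
    """Convert a du -h style size such as '129M' or '8.0K' to bytes."""
    units = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    if size[-1] in units:
        return float(size[:-1]) * units[size[-1]]
    return float(size)


# Translog sizes for shards 3, 1, 2, 0, 4 as listed in the diff.
translog_before = ["129M", "136M", "136M", "129M", "136M"]
translog_after = ["8.0K"] * 5

freed_mb = (
    sum(to_bytes(s) for s in translog_before)
    - sum(to_bytes(s) for s in translog_after)
) / 1024 ** 2

# Size of the whole index directory before and after, in MB (970M and 310M).
total_drop_mb = 970 - 310

print(round(freed_mb), total_drop_mb)
```

The translog freed roughly 666 MB while the index directory shrank by 660 MB. The small gap is consistent with du rounding and with a few index directories growing by about 1 MB, so the shrinkage comes essentially from the translog and not from extra compression.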

For your purposes, think of the translog's disk space as a kind of FIFO buffer for incoming data that empties as soon as it can. Plan for that space to balloon and shrink the way it did between your two measurements.
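
For a first rough sizing, you could turn the two measurements into expansion ratios. The sketch below assumes disk usage scales linearly with raw log volume, which is only an assumption based on this one 144 MB sample. The function name estimate_disk_mb and the 1000 MB input are made up for illustration.

```python
RAW_MB = 144      # raw logs imported (from the question)
STEADY_MB = 310   # index directory after the translog was purged
PEAK_MB = 970     # index directory right after the import

steady_ratio = STEADY_MB / RAW_MB
peak_ratio = PEAK_MB / RAW_MB


def estimate_disk_mb(raw_mb):
    """Return (steady_mb, peak_mb) for a given raw log volume.

    Assumes linear scaling from a single small sample, so treat the
    result as a starting point, not a guarantee.
    """
    return raw_mb * steady_ratio, raw_mb * peak_ratio


steady, peak = estimate_disk_mb(1000)  # hypothetical 1000 MB of raw logs
print(round(steady_ratio, 2), round(peak_ratio, 2))
print(round(steady), round(peak))
```

The peak figure is probably pessimistic for larger volumes, because the translog is flushed as indexing goes on and should not keep growing in step with the data. Measuring with a larger, more representative sample before ordering hardware would give better numbers.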
