Large translog files using 6.0 stack

Hi,

I switched to the new 6.0 ELK stack this morning (starting from a brand new instance rather than trying to migrate) and, while loading data, I noticed that I was running out of disk space on the server running Elasticsearch.

After some digging around, the problem appears to be that under the v6.0 stack the translog files are not being automatically cleaned up, but instead remain in the directory that holds the index.

As a comparison, I loaded 5 days' worth of records from my system into a 5.6.3 ELK environment and then did the same for a 6.0 ELK environment.

Within the 5.6.3 environment, the translog files were pretty large whilst Logstash was feeding records into Elasticsearch. However, some time after Logstash finished (not sure exactly how long - 60 seconds maybe) these translog files disappeared, and the total storage used by Elasticsearch pretty much halved.

The screenshot below shows the folder size mapping for my Elasticsearch 5.6.3 area after this translog removal occurred:

Compare this to the Elasticsearch 6.0 area after the same amount of time following Logstash completion (actually I waited a good 10-15 minutes and nothing changed):

In both 5.6.3 and 6.0 the actual set of records created is the same (which is as expected, given I've loaded the same data to each):

5.6.3:

6.0:

Has anyone else encountered this issue? At the moment I'll need to hold off migrating fully to the 6.0 stack as I'd just run out of disk storage under normal operation.

Cheers,
Steve

This is likely due to the new sequence numbers introduced in Elasticsearch 6.0, which keep translog operations around to make recovery faster. You can tune how much is stored.

Hi Christian,

So in 5.6.3 these files used to disappear entirely after a certain period. I'm fine with the same behaviour in 6.0. Any idea how I configure 6.x to work this way?

Or alternatively (and maybe better, to take advantage of the new functionality), how do I tune this so that it doesn't use as much storage? At the moment it's using pretty much the same amount as my data.

Cheers (and thanks for the quick reply),
Steve

There are two settings mentioned in the blog post that you can tune down, although it is likely to make recovery slower. Whether this matters will depend on your cluster size and data volumes.
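
For reference, the two settings are index.translog.retention.size and index.translog.retention.age (the defaults are 512mb and 12h respectively). Both are dynamic index settings, so as a rough sketch you could turn them down on an existing index via the update settings API - the index name and values below are purely illustrative:

PUT logstash-2017.11.20/_settings
{
  "index.translog.retention.size": "20mb",
  "index.translog.retention.age": "10s"
}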

Hi - I forgot to add that I'm working on only a single node (this is not a PROD environment). Given the blog post talks about synchronization across multiple nodes (which I don't have), is this perhaps the reason why these files never shrink?

...I'll have a proper read of the blog post now but just wanted to make the above clear.

Steve

For a single-node cluster they will not be very useful.

No...that's what I thought... :slight_smile:

OK, I'll tweak the settings for my current installation. The blog post is useful - and mentions the issue of significantly increased disk space - thanks for pointing me to it.

One last question. Are index.translog.retention.size and index.translog.retention.age settable in elasticsearch.yml, or only via the console? It wasn't clear to me from reading the Translog settings page (https://www.elastic.co/guide/en/elasticsearch/reference/6.0/index-modules-translog.html#_translog_settings).

Apologies if this is a dumb question - I'm still relatively new to ELK.

Regards,
Steve

Hi,

I only seem to be able to set these parameters on existing indexes, whereas I'd like to configure Elasticsearch so that each new index it creates automatically has the parameters below applied:

index.translog.retention.size = 10mb
index.translog.retention.age = 10s

Cheers,
Steve

You should be able to use an index template.

Hi Christian,

Thanks for your help - I've set up and applied a template and it seems to be working fine. For info, I used the following template:

PUT _template/translog_settings_single_node
{
  "index_patterns": ["logstash*"],
  "settings": {
    "number_of_shards": 1,
    "index.translog.retention.size": "20mb",
    "index.translog.retention.age" : "10s",
    "index.translog.flush_threshold_size" : "20mb"
  }
}
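
For anyone following along: to sanity-check that the template was stored, you can fetch it back with

GET _template/translog_settings_single_node

and the new logstash* indices created after this picked up the translog settings as expected.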

Regards,
Steve

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.