Is it safe to use filebeat.registry_flush in the filebeat 6.21

RishiSingh · March 1, 2018, 5:49am

Hi,

I am using filebeat 6.21. In some cases the include pattern does not match any line in the log file, and as a result the registry file is updated frequently.
While looking into the code I came across

filebeat.registry_flush

This setting is not mentioned in the documentation, so I wonder whether it is safe to use or is only experimental for now.

I have set it to 100s and it looks fine to me, but I may have missed some test cases where this setting could leave filebeat in an unrecoverable state.

Thanks

ruflin · March 4, 2018, 12:10am

I think it's fine to use. It would be good to also get @steffens' take on this.

@steffens Should we document this?

steffens · March 12, 2018, 5:00pm

The registry_flush setting controls when and how often the registry is serialized to disk. All state changes are buffered in main memory until the flush happens. The way the registry works right now, it keeps all state in memory, and the registry file update is basically a snapshot of the current state. With registry_flush: 0 (the default), each ACKed batch of events triggers a snapshot.

State updates include file renames and the offsets of the last sent events. If the state has not been flushed yet when filebeat is restarted, filebeat will have to send already published events again. Filebeat flushes the registry on normal shutdown, but if the machine or filebeat crashes, or if filebeat is forced to shut down, the final registry flush is missing. This leads to duplicates. As some events can still be in the pipeline (not yet ACKed), also use shutdown_timeout to reduce the chance of duplicates.
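For illustration, a minimal filebeat.yml sketch that combines the two settings mentioned above. The 100s value is the one from the original question; the 5s shutdown timeout is only an assumed example value, not a recommendation from this thread.

```yaml
# Write the in-memory registry state to disk at most every 100s
# (value taken from the original question; 0, the default, flushes
# after each ACKed batch of events).
filebeat.registry_flush: 100s

# On shutdown, wait up to this long for in-flight events to be ACKed
# before exiting. 5s is an assumed example value.
filebeat.shutdown_timeout: 5s
```

Both values are a trade-off and should be adjusted to the event rate and the number of duplicates that can be tolerated.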

There is no 'perfect' value for registry_flush. It's a trade-off between the chance of duplicates on crashes and overall disk IO. It's a 'risk' you will have to take as a user. The number of duplicates you might experience depends on the event rate and the registry_flush interval. A rough estimate is avg eps * registry_flush.
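As a worked instance of that estimate, assume a hypothetical average rate of 500 events per second (not a figure from this thread) together with the 100s interval from the original question. A crash just before a flush would then resend roughly:

```latex
\[
\text{duplicates} \approx \overline{\text{eps}} \times \text{registry\_flush}
= 500\ \tfrac{\text{events}}{\text{s}} \times 100\ \text{s}
= 50\,000\ \text{events}
\]
```

This is the worst case for a single crash; a crash shortly after a flush would resend far fewer events.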

After a missed flush, filebeat restarts with somewhat old state. On startup, however, filebeat resyncs the in-memory registry state and continues processing from the 'old' offsets.

The setting not being documented is a bug. Please open an issue here. Thanks.

