A question on the spooling to disk

sentient · September 18, 2018, 8:02pm

When events cannot be sent to Elasticsearch, they are nicely written to disk.
On restart, the events are sent to Elasticsearch.

I was wondering, do we always need to restart? Or can we set an option to retry the submission after some time period? Or, once other events are successfully submitted again, can the pending events be resubmitted?

jsoriano · September 19, 2018, 10:05am

Hi @sentient,

Do you mean restarting Filebeat? There is no need to restart it. Filebeat keeps retrying the connection until Elasticsearch is available again; when it is, and if the spool queue is being used, the pending events are submitted. The reconnection is not immediate, but it should happen within a matter of seconds. You can see the reconnection attempts in the logs.

Are you observing a different behaviour?

sentient · September 19, 2018, 2:35pm

Jaime,

Thanks for your reply. I am actually developing a custom beat to send Statsd data into Elasticsearch (https://github.com/sentient/statsdbeat). It is not complete yet, still under development.

Initially I was writing my own disk storage but then I found out about the spooling functionality (just what I was looking for).

So if you write your own beat and use the spooling queue, how do you control the retry functionality?
I did notice entries in the logs about a 'retryer' sending signals. Do I have to act on these, or is this something completely different?

2018-09-18T16:48:29.784-0700    INFO    [publish]       pipeline/retry.go:172   retryer: send unwait-signal to consumer                                                          
2018-09-18T16:48:29.784-0700    INFO    [publish]       pipeline/retry.go:174     done                                                                                           
2018-09-18T16:48:29.784-0700    INFO    [publish]       pipeline/retry.go:149   retryer: send wait signal to consumer                                                            
2018-09-18T16:48:29.784-0700    INFO    [publish]       pipeline/retry.go:151     done

jsoriano · September 19, 2018, 4:47pm

Oh, this is nice. I was assuming that you were trying it with Filebeat, not your own beat.

To use the disk queue, adding queue.spool: ~ to your configuration should be enough. But the retry logic is part of the output and is controlled there, not in the queue. For example, the Elasticsearch output has parameters for this, such as max_retries, backoff.init and backoff.max.
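As a rough sketch of how the two pieces could sit together in a beat's configuration file: the queue section enables the spool with its defaults, and the retry settings live under the output. The host and all values below are illustrative assumptions, not recommendations, so check the documentation for your Beats version before relying on them.

```yaml
# Enable the disk spool queue with its default settings.
queue.spool: ~

output.elasticsearch:
  # Assumed host, replace with your own cluster address.
  hosts: ["localhost:9200"]
  # Retry behaviour is configured on the output, not on the queue.
  # The values here are only examples.
  max_retries: 3
  backoff.init: 1s   # wait before the first reconnection attempt
  backoff.max: 60s   # upper bound for the wait between attempts
```

With a configuration along these lines, the wait between reconnection attempts starts at backoff.init and grows up to backoff.max while Elasticsearch is unreachable.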

By the way, have you considered adding a statsd metricset to Metricbeat instead of implementing a whole beat? There is, for example, a module for Graphite.

sentient · September 19, 2018, 6:09pm

Thanks again Jaime,

Great tip about the Metricbeat module. I will look into that.
