A question on the spooling to disk

(Sentient)

When events cannot be sent to Elasticsearch, they are nicely written to disk.
On restart the events are sent to Elasticsearch.

I was wondering: do we always need to restart? Or can we set an option to retry the submission after some time period? Or, once other events are successfully submitted again, can the pending events be resubmitted?

(Jaime Soriano)

Hi @sentient,

Do you mean restarting Filebeat? There is no need to restart it. Filebeat keeps retrying to connect until Elasticsearch is available again; once it is, and if the spool queue is being used, the pending events are submitted. The reconnection is not immediate, but it should happen within a matter of seconds. You can see the reconnection attempts in the logs.

Are you observing a different behaviour?

(Sentient)


Thanks for your reply. I am actually developing a custom beat to send StatsD data into Elasticsearch (https://github.com/sentient/statsdbeat). It is not complete yet, still under development.

Initially I was writing my own disk storage but then I found out about the spooling functionality (just what I was looking for).

So if you write your own beat and use the spooling queue, how do you control the retry functionality?
I did notice things in the logs about a 'retryer' sending signals. Do I have to act on these, or is this something completely different?

2018-09-18T16:48:29.784-0700    INFO    [publish]       pipeline/retry.go:172   retryer: send unwait-signal to consumer                                                          
2018-09-18T16:48:29.784-0700    INFO    [publish]       pipeline/retry.go:174     done                                                                                           
2018-09-18T16:48:29.784-0700    INFO    [publish]       pipeline/retry.go:149   retryer: send wait signal to consumer                                                            
2018-09-18T16:48:29.784-0700    INFO    [publish]       pipeline/retry.go:151     done    

(Jaime Soriano)

Oh, this is nice! I was assuming that you were trying it with Filebeat, not your own beat :)

To use the disk queue, it should be enough to add queue.spool: ~ to your configuration. But the retry logic is part of the output and is controlled there, not in the queue. For example, the Elasticsearch output has some parameters for that, such as max_retries, backoff.init and backoff.max.
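
As a rough sketch of how those settings could sit together in a beat's configuration file: the host below is an assumption, and the retry values are only illustrative, so check the output documentation for your Beats version before relying on them.

```yaml
# Enable the disk spool queue with its default settings.
queue.spool: ~

output.elasticsearch:
  # Assumed local Elasticsearch instance; replace with your own hosts.
  hosts: ["localhost:9200"]

  # Illustrative values, not recommendations.
  # Number of times to retry publishing a batch after a failure.
  max_retries: 3
  # Time to wait before the first reconnection attempt after a network error.
  backoff.init: 1s
  # Upper bound for the wait between reconnection attempts.
  backoff.max: 60s
```

With a configuration like this, the queue only takes care of buffering events on disk, while how often and how quickly publishing is retried is decided by the output section.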

By the way, have you considered adding a statsd metricset to Metricbeat instead of implementing a whole beat? There is, for example, a module for Graphite.

(Sentient)

Thanks again Jaime,

Great tip about the Metricbeat module. I will take a look into that.


