When events cannot be sent to Elasticsearch, they are nicely written to disk.
On restart, the events are sent to Elasticsearch.
I was wondering, do we always need to restart? Or can we set an option to retry submitting after some time period? Or, when other events are successfully submitted again, can the pending events be resubmitted?
Do you mean restarting Filebeat? There is no need to restart it. Filebeat keeps retrying to connect until Elasticsearch is available again; when it is, and if the spool queue is being used, pending events are submitted. This reconnection is not immediate, but it should happen within a matter of seconds. You can see the reconnection attempts in the logs.
Thanks for your reply. I am actually developing a custom beat to send StatsD data into Elasticsearch (https://github.com/sentient/statsdbeat). It is not complete yet, just under development.
Initially I was writing my own disk storage, but then I found out about the spooling functionality (just what I was looking for).
So if you write your own beat and use the spooling queue, how do you control the retry functionality?
I did notice things in the logs about a 'retryer' sending signals. Do I have to act on this (or is this something completely different)?
2018-09-18T16:48:29.784-0700 INFO [publish] pipeline/retry.go:172 retryer: send unwait-signal to consumer
2018-09-18T16:48:29.784-0700 INFO [publish] pipeline/retry.go:174 done
2018-09-18T16:48:29.784-0700 INFO [publish] pipeline/retry.go:149 retryer: send wait signal to consumer
2018-09-18T16:48:29.784-0700 INFO [publish] pipeline/retry.go:151 done
Oh, this is nice. I was assuming that you were trying it with Filebeat, not your own beat.
To use the disk queue it should be enough to add `queue.spool: ~` to your configuration. But retry logic is part of the output and is controlled there, not in the queue. You can find some parameters for that in the Elasticsearch output, for example `max_retries`, `backoff.init` or `backoff.max`.
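A minimal sketch of what that could look like in the beat's configuration file (the spool file path/size and the backoff values here are illustrative assumptions, not recommendations):

```yaml
# Enable the on-disk spool queue with explicit settings
# (queue.spool: ~ alone uses the defaults).
queue.spool:
  file:
    path: "${path.data}/spool.dat"   # assumed location for the spool file
    size: 512MiB                     # assumed maximum spool file size

# Retry behavior lives in the output, not in the queue.
output.elasticsearch:
  hosts: ["localhost:9200"]
  max_retries: 3        # retries per batch before events are dropped or requeued
  backoff.init: 1s      # wait after the first failed connection attempt
  backoff.max: 60s      # cap for the exponentially growing wait time
```

With settings like these the output keeps backing off and retrying on its own, so the beat itself does not need to react to the 'retryer' log messages.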
By the way, have you considered adding a statsd metricset to Metricbeat instead of implementing a whole beat? There is, for example, a module for Graphite.
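For comparison, enabling the Graphite module in Metricbeat is just a small config fragment; the protocol and port below are assumptions based on Graphite's usual plaintext defaults:

```yaml
# Sketch of a Metricbeat module configuration for Graphite.
metricbeat.modules:
  - module: graphite
    metricsets: ["server"]
    protocol: "udp"       # assumed: listen for the Graphite plaintext protocol over UDP
    host: "localhost"
    port: 2003            # assumed: Graphite's conventional plaintext port
```

A statsd metricset could follow the same pattern, which would let Metricbeat's existing queueing and output retry logic handle delivery for you.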