I'm just wondering if there is functionality for staggering all of the ping requests within a heartbeat monitor so that they do not occur at the same time.
For example, would it be possible to have 1000 or so requests executed within a 1 minute period, evenly distributed across that time range? Ideally this would be within a single monitor. I understand that this could be achieved by splitting those 1000 hosts into multiple monitors that each run on a separate schedule. However, to spread the requests fairly evenly over 60 seconds I would need numerous monitors, which isn't really ideal.
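To make the desired behaviour concrete, here is a minimal sketch of the even distribution described above. It is not a Heartbeat feature; the function name `staggered_offsets` and its parameters are made up for illustration. It only computes when each request would start within the period.

```python
def staggered_offsets(num_hosts: int, period_s: float) -> list[float]:
    """Return one start offset (in seconds) per host, evenly spaced in [0, period_s)."""
    if num_hosts <= 0:
        return []
    step = period_s / num_hosts  # gap between consecutive requests
    return [i * step for i in range(num_hosts)]


# 1000 hosts over a 60 second period: one request roughly every 60 ms.
offsets = staggered_offsets(1000, 60.0)
```

With these numbers the first request starts at 0 s and the last one starts just before the 60 s mark, so no two requests leave at the same instant.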
Currently all 1000 of those requests are completed in under a second, with many of them returning a "No buffer space available" error, which I'm assuming is caused by some sort of throttling on the NIC.
The monitors are already split into IO tasks, with the IP lookup via DNS and the actual ping being separate IO tasks. Currently one can use heartbeat.scheduler.limit to limit the number of concurrently active IO tasks, but this is often not good enough. Rate limiting (spreading the tasks in time) is a mandatory item on the Heartbeat GA roadmap.
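A rough way to see why a concurrency limit alone may not be enough is to compare start times under the two approaches. The sketch below is a simplified simulation, not Heartbeat's scheduler: it assumes every task takes the same fixed time, and the names `starts_with_concurrency_limit` and `starts_with_rate_limit` are hypothetical.

```python
def starts_with_concurrency_limit(num_tasks: int, limit: int, task_duration_s: float) -> list[float]:
    """Start times when at most `limit` tasks run at once.

    Simplifying assumption: all tasks take exactly `task_duration_s`,
    so tasks run in back-to-back batches of size `limit`.
    """
    return [(i // limit) * task_duration_s for i in range(num_tasks)]


def starts_with_rate_limit(num_tasks: int, period_s: float) -> list[float]:
    """Start times when tasks are paced evenly across `period_s`."""
    interval = period_s / num_tasks
    return [i * interval for i in range(num_tasks)]


# 1000 pings that each take about 1 ms, with at most 100 in flight at a time.
limited = starts_with_concurrency_limit(1000, 100, 0.001)
# The same 1000 pings paced over 60 seconds.
paced = starts_with_rate_limit(1000, 60.0)

burst_window_s = max(limited)  # all tasks have started after about 9 ms
paced_window_s = max(paced)    # last task starts close to the 60 s mark
```

Under these assumptions the concurrency limit still sends everything within a few milliseconds, because fast tasks free their slots almost immediately, while pacing spreads the starts over the whole period.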
Thanks for your reply Steffens, good to see this item is on the roadmap. I ended up fixing my issue by increasing the max socket buffer size on my system in order to handle all requests going out at once.