I've successfully tested collecting system statistics (load, RAM, etc.) as described in the CollectD codec docs. However, I haven't been able to figure out how to avoid losing data from my nodes if the Logstash indexer goes down.
For my other logs, I use Logstash-Forwarder, HAProxy, and two Logstash indexer instances. As far as I've been able to tell, that avoids all (or at least most) data loss when the indexers are restarted or go down for some reason.
CollectD, on the other hand, sends over UDP, so I can't load balance it with HAProxy. I could list both indexers, but then I end up with duplicate data. As far as my reading of the manual goes, CollectD doesn't seem to support sending to one server and failing over to another, which makes sense, since UDP doesn't care whether the data is actually received...
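To illustrate the UDP behaviour I mean, here is a minimal Python sketch. It is not CollectD's code or its binary protocol: the payload is a placeholder string, and the helper names free_udp_port and send_metric are made up for this example. It sends the same datagram to two "indexers" on localhost, one of which has nothing listening, and the send reports success for both.

```python
import socket


def free_udp_port() -> int:
    """Ask the OS for a UDP port, then release it so nothing listens there."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]


def send_metric(payload: bytes, servers: list[tuple[str, int]]) -> list[int]:
    """Send one datagram to every listed server (hypothetical helper).

    Returns the byte count reported by each sendto() call. UDP gives the
    sender no acknowledgement, so the count says nothing about delivery.
    """
    sent = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for addr in servers:
            sent.append(sock.sendto(payload, addr))
    return sent


if __name__ == "__main__":
    # Stand-in for an indexer that is up.
    live = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    live.bind(("127.0.0.1", 0))
    live.settimeout(2)

    # Stand-in for an indexer that is down: a port with no listener.
    dead_port = free_udp_port()

    payload = b"load=0.42"  # placeholder, not the real CollectD wire format
    results = send_metric(payload, [("127.0.0.1", dead_port), live.getsockname()])

    print("bytes reported sent per server:", results)
    print("live indexer received:", live.recv(1024))
    live.close()
```

Both sends report the full payload length, including the one to the dead port, so the sender has nothing to base a failover decision on. And if both indexers are up, each of them receives its own copy of every datagram, which is where the duplicates come from.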
I don't see any means to tell CollectD to use TCP.
So, any suggestions? A better ELK setup? Alternatives to CollectD? Etc.
Note: I think this will be solved in Logstash 2, since Logstash 2 implements better queues and hopefully has a smaller memory footprint. Those two things would let me ditch LSF and just tell CollectD to send to the local Logstash instance.