How to avoid data loss when using CollectD?


(David Reagan) #1

I've successfully tested collecting system load, RAM, and other statistics as described in the CollectD codec docs. That said, I haven't been able to figure out how to avoid losing data from my nodes when the Logstash indexer goes down.

For my other logs, I use Logstash-Forwarder, HAProxy, and two Logstash indexer instances. That, as far as I've been able to tell, avoids all (or at least most) data loss when the indexers are restarted or go down for some reason.
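For context, the TCP side of that setup looks roughly like this. This is only a sketch; the hostnames and ports are placeholders, and the Lumberjack traffic from LSF is just passed through in TCP mode:

```
# haproxy.cfg (sketch; hosts/ports are placeholders for my setup)
frontend lumberjack_in
    bind *:5043
    mode tcp
    default_backend logstash_indexers

backend logstash_indexers
    mode tcp
    balance roundrobin
    # Plain TCP health checks; if one indexer goes down,
    # HAProxy routes new connections to the other.
    server indexer1 logstash-a.example.com:5043 check
    server indexer2 logstash-b.example.com:5043 check
```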

CollectD, on the other hand, sends using UDP, so I can't load balance with HAProxy. I could list both indexers, but then I'd end up with duplicate data. And as far as my reading of the manual goes, CollectD doesn't support sending to one server and failing over to another, which makes sense, since UDP doesn't care whether the data is actually received...

I don't see any means to tell CollectD to use TCP.
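To illustrate what I mean about the duplicates, here's roughly what my network plugin config would look like with both indexers listed (hostnames are placeholders); every metric gets sent to both servers:

```
# /etc/collectd/collectd.conf (sketch)
LoadPlugin network
<Plugin network>
  # The network plugin speaks collectd's binary protocol over UDP only.
  # Listing two servers means every metric is sent to both of them,
  # so both indexers index the same data point.
  <Server "logstash-a.example.com" "25826">
  </Server>
  <Server "logstash-b.example.com" "25826">
  </Server>
</Plugin>
```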

So, any suggestions? A better ELK setup? Alternatives to CollectD? Etc.

Note: I think this will be solved in Logstash 2, since it implements better queues and will hopefully have a smaller memory footprint. Those two things would let me ditch LSF and just tell CollectD to send to the local Logstash instance.


(Aaron Mildenstein) #2

Your best bet for now would be to use Logstash to send the collectd data out to Redis, RabbitMQ, Kafka, or even a file, and then ingest from those sources via your HA setup. Indeed, Logstash 2.0 should fix the persistence issue.
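A minimal sketch of what I mean, using Redis as the broker (hosts, ports, and the `collectd` key are placeholders you'd adapt): a lightweight Logstash "shipper" receives the UDP collectd traffic and pushes to Redis, and your existing HA indexers pull from there.

```
# shipper.conf (sketch) — receives collectd's UDP traffic locally
input {
  udp {
    port        => 25826
    buffer_size => 1452
    codec       => collectd { }
  }
}
output {
  redis {
    host      => "redis.example.com"
    data_type => "list"
    key       => "collectd"
  }
}

# indexer.conf (sketch) — each HA indexer pops from the same list,
# so events are consumed once rather than duplicated
input {
  redis {
    host      => "redis.example.com"
    data_type => "list"
    key       => "collectd"
  }
}
output {
  elasticsearch { }
}
```

Since both indexers pop from the same Redis list, you get load balancing and failover without duplicate events, and Redis buffers the metrics while an indexer is down.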


(David Reagan) #3

Hrm... Yeah, I thought of that. But the whole point of using LSF is to avoid having to install Java on all my nodes, as well as Logstash's current memory footprint. The SSL communication is also nice. (Or does Logstash support SSL, and I just forgot?)

I'll get over having to install Java, but I really should be using less RAM than I already am. Our VMware cluster is way too close to its limit as it is...

Would one of the new "beats" help with any of this, eventually?

Maybe I'll just leave Observium running until Logstash 2 solves the data loss issue. The only thing it doesn't give me is integration with Kibana dashboards.
