Heartbeat is crashing due to High Availability Cluster Communication

iamyogesh · May 17, 2019, 11:55am

For confirmed bugs, please report:

Version: heartbeat 5.4.3, elasticsearch 4 node cluster with version 6.3.2
Operating System: heartbeat installed on ubuntu 18.04, elasticsearch cluster centos 7
Discuss Forum URL:

Heartbeat is running on an Ubuntu machine to check the status of an IP address, using the following configuration:

# Configure monitors
heartbeat.monitors:
- type: icmp
  # Configure task schedule
  schedule: '*/10 * * * *'
  hosts: ["192.168.11.76"]

It pushes data to Elasticsearch. The heartbeat service crashes after 2 days of continuous running.

Steps to Reproduce:
systemctl status heartbeat

● heartbeat.service - Heartbeat High Availability Cluster Communication and Membership
Loaded: loaded (/lib/systemd/system/heartbeat.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Thu 2019-05-16 10:15:10 IST; 1h 53min ago
Process: 929 ExecStart=/usr/lib/heartbeat/heartbeat -f (code=exited, status=6)
Main PID: 929 (code=exited, status=6)

May 16 10:15:10 adumaster systemd[1]: Started Heartbeat High Availability Cluster Communication and Memb
May 16 10:15:10 adumaster heartbeat[929]: May 16 10:15:10 adumaster heartbeat: [929]: ERROR: Cannot open
May 16 10:15:10 adumaster heartbeat[929]: May 16 10:15:10 adumaster heartbeat: [929]: info: An annotated
May 16 10:15:10 adumaster heartbeat[929]: May 16 10:15:10 adumaster heartbeat: [929]: info: Please copy
May 16 10:15:10 adumaster heartbeat[929]: May 16 10:15:10 adumaster heartbeat: [929]: ERROR: Heartbeat n
May 16 10:15:10 adumaster heartbeat[929]: May 16 10:15:10 adumaster heartbeat: [929]: ERROR: Configurati
May 16 10:15:10 adumaster systemd[1]: heartbeat.service: Main process exited, code=exited, status=6/NOTC
May 16 10:15:10 adumaster systemd[1]: heartbeat.service: Failed with result 'exit-code'.

Check the error log messages in:
/var/log/heartbeat/heartbeat

2019-05-17T14:21:22+05:30 INFO No non-zero metrics in the last 30s
2019-05-17T14:21:30+05:30 ERR Failed to perform any bulk index operations: Post http://ip:port/_bulk: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
2019-05-17T14:21:30+05:30 INFO Error publishing events (retrying): Post http://ip:port/_bulk: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
2019-05-17T14:21:31+05:30 ERR Connecting error publishing events (retrying): Get http://ip:port: EOF

martinr_ubi · May 17, 2019, 1:19pm

The service name is heartbeat-elastic, not heartbeat. Your systemctl command is for another piece of software called heartbeat that has nothing to do with Elastic.

iamyogesh · May 20, 2019, 6:53am

For heartbeat 5.4 we have to use systemctl status heartbeat.
For heartbeat 6.5 it is heartbeat-elastic.

martinr_ubi · May 20, 2019, 7:33am

True, my bad. The package and service were renamed in >=6.0.

So I guess you'll need to follow the doc here:
https://www.elastic.co/guide/en/beats/heartbeat/5.6/setup-repositories.html

But in addition since you already have the conflicting software called "Heartbeat High Availability Cluster Communication and Membership" which currently uses the service name "heartbeat" in your system, you'll have to either remove the conflicting software or rename one of the 2 services so they stop conflicting.

I would guess that you are not using the old conflicting software, so the easiest fix is probably to remove it and make sure your system has only one thing called "heartbeat". Did you try removing the conflicting software and then reinstalling heartbeat? If you are using it, then you need to rename one of the 2 services so they stop conflicting.
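
To tell which of the two programs a given systemctl status output belongs to, something like the sketch below might help. It only matches on unit descriptions: the Linux-HA one is copied from the status output in this thread, while the Elastic one is an assumption based on the "Ping remote services for availability ..." text seen in other heartbeat topics, so check it against your installed version. The function name is made up for illustration.

```python
# Description fragment of the Linux-HA unit, as shown in the status output above.
LINUX_HA_MARKER = "High Availability Cluster Communication"
# Assumed description fragment of the Elastic heartbeat unit (not confirmed
# for 5.4.3; verify against your own unit file).
ELASTIC_MARKER = "Ping remote services for availability"


def identify_heartbeat_unit(status_text: str) -> str:
    """Guess which 'heartbeat' a `systemctl status` dump belongs to."""
    if LINUX_HA_MARKER in status_text:
        return "linux-ha"
    if ELASTIC_MARKER in status_text:
        return "elastic"
    return "unknown"


# First line of the status output posted in this thread.
sample_status = (
    "● heartbeat.service - Heartbeat High Availability "
    "Cluster Communication and Membership"
)
print(identify_heartbeat_unit(sample_status))
```

If this reports "linux-ha" for the unit named heartbeat, the service being started is the cluster software and not the Elastic beat.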

Does that help?
You're suffering from the name conflict described above, which was fixed in heartbeat >=6.0 by renaming the package and service to "heartbeat-elastic".

If the goal of your post was to discuss the error in your Elastic heartbeat logs, the message indicates that heartbeat cannot connect to your Elasticsearch server. Does it crash after 2 days of runtime? Your logs say heartbeat cannot connect to Elasticsearch, so while it was running for those 2 days, was it actually shipping events into Elasticsearch?
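
One way to answer that from the log file is to find when the first ERR line appears. The sketch below assumes every line has the same "timestamp LEVEL message" layout as the excerpt posted above; the function name and the sample list are only for illustration.

```python
from datetime import datetime


def publish_errors(log_lines):
    """Return (timestamp, message) for every ERR-level line."""
    errors = []
    for line in log_lines:
        parts = line.strip().split(None, 2)
        if len(parts) < 3 or parts[1] != "ERR":
            continue
        try:
            ts = datetime.fromisoformat(parts[0])
        except ValueError:
            # Skip lines whose first field is not an ISO 8601 timestamp.
            continue
        errors.append((ts, parts[2]))
    return errors


# Lines copied from the log excerpt in this thread.
SAMPLE = [
    "2019-05-17T14:21:22+05:30 INFO No non-zero metrics in the last 30s",
    "2019-05-17T14:21:30+05:30 ERR Failed to perform any bulk index operations: Post http://ip:port/_bulk: net/http: request canceled (Client.Timeout exceeded while awaiting headers)",
    "2019-05-17T14:21:30+05:30 INFO Error publishing events (retrying): Post http://ip:port/_bulk: net/http: request canceled (Client.Timeout exceeded while awaiting headers)",
    "2019-05-17T14:21:31+05:30 ERR Connecting error publishing events (retrying): Get http://ip:port: EOF",
]

errors = publish_errors(SAMPLE)
first_error_time = errors[0][0] if errors else None
print(len(errors), first_error_time)
```

Running it over the whole /var/log/heartbeat/heartbeat file (for example with `open(path).readlines()`) would show whether the errors start right away or only after some time of successful shipping.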

system · June 17, 2019, 9:33am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
