I am using Heartbeat version 8.6 and connecting to a self-managed Elasticsearch version 8.11.4. Currently, I have over 60 http-type monitors running and can view their status in Kibana without any problem. I added one additional tcp-type monitor in the heartbeat.yml file, but encountered an error that I need help with:
io:read tcp <source-ip-address-where-heartbeat-is-running>:<random-port> -> <target-hostname1-ip-address>:<target-hostname1-port>: i/o timeout
Snippet from the heartbeat.yml configuration file:
- type: tcp
  enabled: true
  id: Monitor1
  name: MonitorOne
  schedule: '@every 120s'
  timeout: 60
  hosts: ["<target-hostname1>"]
  ports: [<target-hostname1-port>]
  ssl.enabled: false
  check.send: "<Message1>"
  check.receive: "True"
- type: http
  enabled: true
  id: Monitor2
  name: MonitorTwo
  schedule: '@every 120s'
  timeout: 60
  urls: ["https://<target-hostname2>/"]
output.elasticsearch:
  hosts: ["<elasticsearch>:<port>"]
  protocol: "https"
  allow_older_versions: true
  ssl.verification_mode: "none"
  username: "<user-name>"
  password: "<password>"
I have an application on server "target-hostname1" that listens on "target-hostname1-port"; it accepts the string "Message1" and responds with the string "True".
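For illustration only, the responder behaves roughly like this minimal sketch (this is not the real application; the port number and the matching logic are assumptions):

import socket

# Minimal sketch of the responder behaviour (assumed, not the real application):
# accept a connection, read the request, and answer "True" when "Message1" arrives.
HOST, PORT = "0.0.0.0", 9999  # 9999 stands in for <target-hostname1-port>

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((HOST, PORT))
    srv.listen()
    while True:
        conn, _ = srv.accept()
        with conn:
            data = conn.recv(1024)            # read what the client sent
            if data.strip() == b"Message1":   # compare against the expected string
                conn.sendall(b"True")         # reply with the string "True"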
From the same source server where Heartbeat is running, I was able to run successful tests that rule out any firewall issues and confirm that the MonitorOne application can receive and answer with strings over TCP. The application's response time is far below the timeout value.
echo "Message1" | curl -ivk telnet://<target-hostname1>:<target-hostname1-port>
and
echo "Message1" | nc -v <target-hostname1> <target-hostname1-port>
Both of the above commands, run from the terminal, respond with the string "True".
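Note that echo appends a trailing newline to "Message1", while Heartbeat's check.send, as far as I can tell, sends only the configured bytes. To exercise the same kind of exchange with the exact payload, a raw-socket test along these lines can be used (a sketch; host and port are placeholders):

import socket

# Sketch of a check similar to Heartbeat's check.send/check.receive:
# send the payload without a trailing newline and wait for the reply.
host = "<target-hostname1>"        # placeholder
port = 12345                       # stands in for <target-hostname1-port>

with socket.create_connection((host, port), timeout=60) as s:
    s.sendall(b"Message1")         # exact bytes, no newline appended
    reply = s.recv(1024)
    print(reply)                   # expected to contain b"True"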
From the server logs, I was able to confirm that a request came in with "Message1" and the response was also sent out from the application.
I tried increasing the timeout value of Monitor1 to give "target-hostname1" more time to respond, but the behavior did not change. I continuously receive the same error message from Heartbeat, each time with a different source port:
<source-ip-address-where-heartbeat-is-running>:<random-port-1> -> <target-hostname1-ip-address>:<target-hostname1-port>: i/o timeout
<source-ip-address-where-heartbeat-is-running>:<random-port-2> -> <target-hostname1-ip-address>:<target-hostname1-port>: i/o timeout
<source-ip-address-where-heartbeat-is-running>:<random-port-3> -> <target-hostname1-ip-address>:<target-hostname1-port>: i/o timeout
When I remove check.send and check.receive from MonitorOne, Heartbeat reports MonitorOne as "up", which confirms that the TCP connection itself succeeds. But when I keep check.send and remove check.receive, I continue to get the i/o timeout error in the form of
<source-ip-address-where-heartbeat-is-running>:<random-port> -> <target-hostname1-ip-address>:<target-hostname1-port>: i/o timeout
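For reference, the variant of MonitorOne that does report "up" is just the plain connection check, i.e. the same monitor with the two check lines removed:

- type: tcp
  enabled: true
  id: Monitor1
  name: MonitorOne
  schedule: '@every 120s'
  timeout: 60
  hosts: ["<target-hostname1>"]
  ports: [<target-hostname1-port>]
  ssl.enabled: false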
Can anybody provide some thoughts on why Heartbeat cannot read the response from the application running on "target-hostname1-ip-address":"target-hostname1-port"?
I already checked the thread "Postgresql tcp check is not working" and am seeking further help.