Hi,
We use hearbeat to monitor a hundred applications. In general it works well, but sometimes some applications are down (only for a single ping) according to the tool.
However, this is not the case according to our tests. The error is always the following one: dial tcp xx.xx.xxx.xx: 443: i / o timeout (Client.Timeout exceeded while awaiting headers)
The only similar topic we found was this one: Heartbeat timeout.
Regarding our infrastructures, hearbeat is located on a pod (under docker.elastic.co/beats/heartbeat:7.14.1), in a k3s cluster.
The logs, in debug mode, don't give us any additional information.
We thought about a network problem, and verify this by dumping the network in the heartbeat pod.
For example, we had a timeout on the document API, at endpoint /document/system/health at 08:40.
Dump:
469 2021-09-26 08:39:00,075550 HTTP 281 GET /oauth-sap/system/info/ HTTP/1.1
471 2021-09-26 08:39:00,077059 HTTP 296 GET /comptabilite-auxilliaire/system/info/ HTTP/1.1
473 2021-09-26 08:39:00,077448 HTTP 281 GET /stats-sia/system/info/ HTTP/1.1
478 2021-09-26 08:39:00,078396 HTTP 287 GET /activedirectory/system/info/ HTTP/1.1
479 2021-09-26 08:39:00,078451 HTTP 278 GET /pulpes/system/info/ HTTP/1.1
481 2021-09-26 08:39:00,078612 HTTP 277 GET /proxy/system/info/ HTTP/1.1
482 2021-09-26 08:39:00,078618 HTTP 279 GET /graines/system/info/ HTTP/1.1
484 2021-09-26 08:39:00,078856 HTTP 297 GET /indicateurconditionnement/system/info/ HTTP/1.1
485 2021-09-26 08:39:00,078908 HTTP 283 GET /login-extra/system/info/ HTTP/1.1
487 2021-09-26 08:39:00,079479 HTTP 268 GET /mapproxy/ HTTP/1.1
492 2021-09-26 08:39:00,079648 HTTP 282 GET /login-adfs/system/info/ HTTP/1.1
494 2021-09-26 08:39:00,079781 HTTP 294 GET /transporteur-reporting/system/info/ HTTP/1.1
496 2021-09-26 08:39:00,079915 HTTP 278 GET /divers/system/info/ HTTP/1.1
497 2021-09-26 08:39:00,079936 HTTP 283 GET /facturation/system/info/ HTTP/1.1
499 2021-09-26 08:39:00,080053 HTTP 293 GET /livraisons-receptions/system/info/ HTTP/1.1
500 2021-09-26 08:39:00,080114 HTTP 286 GET /sync-carto-sia/system/info/ HTTP/1.1
501 2021-09-26 08:39:00,080463 HTTP 278 GET /paille/system/info/ HTTP/1.1
509 2021-09-26 08:39:00,080906 HTTP 286 GET /capital-social/system/info/ HTTP/1.1
--510 2021-09-26 08:39:00,080906 HTTP 281 GET /document/system/health HTTP/1.1
513 2021-09-26 08:39:00,080985 HTTP 283 GET /partenaires/system/info/ HTTP/1.1
515 2021-09-26 08:39:00,081009 HTTP 283 GET /analyselabo/system/info/ HTTP/1.1
516 2021-09-26 08:39:00,081165 HTTP 289 GET /droits-production/system/info/ HTTP/1.1
517 2021-09-26 08:39:00,081180 HTTP 291 GET /planning-betteraves/system/info/ HTTP/1.1
518 2021-09-26 08:39:00,081234 HTTP 280 GET /enquetes/system/info/ HTTP/1.1
519 2021-09-26 08:39:00,081304 HTTP 291 GET /negoces-prestations/system/info/ HTTP/1.1
520 2021-09-26 08:39:00,081349 HTTP 285 GET /notifications/system/info/ HTTP/1.1
521 2021-09-26 08:39:00,081377 HTTP 282 GET /erc-config/system/info/ HTTP/1.1
528 2021-09-26 08:39:00,081992 HTTP 292 GET /autorisationcuws-sap/system/info/ HTTP/1.1
529 2021-09-26 08:39:00,081999 HTTP 281 GET /videcache/system/info/ HTTP/1.1
530 2021-09-26 08:39:00,082002 HTTP 290 GET /administration-ihm/system/info/ HTTP/1.1
531 2021-09-26 08:39:00,082047 HTTP 299 GET /parcelles-api/actuator/health/readiness/ HTTP/1.1
533 2021-09-26 08:39:00,082164 HTTP 284 GET /uidparameter/system/info/ HTTP/1.1
535 2021-09-26 08:39:00,082200 HTTP 284 GET /fin-campagne/system/info/ HTTP/1.1
536 2021-09-26 08:39:00,082289 HTTP 281 GET /agronomie/system/info/ HTTP/1.1
537 2021-09-26 08:39:00,082382 HTTP 273 GET /geoserver/web/ HTTP/1.1
539 2021-09-26 08:39:00,090326 HTTP 284 GET /transporteur/system/info/ HTTP/1.1
684 2021-09-26 08:40:00,089762 HTTP 281 GET /agronomie/system/info/ HTTP/1.1
689 2021-09-26 08:40:00,091605 HTTP 273 GET /geoserver/web/ HTTP/1.1
690 2021-09-26 08:40:00,091612 HTTP 283 GET /facturation/system/info/ HTTP/1.1
691 2021-09-26 08:40:00,091617 HTTP 294 GET /transporteur-reporting/system/info/ HTTP/1.1
692 2021-09-26 08:40:00,091622 HTTP 283 GET /partenaires/system/info/ HTTP/1.1
694 2021-09-26 08:40:00,092540 HTTP 281 GET /stats-sia/system/info/ HTTP/1.1
699 2021-09-26 08:40:00,092646 HTTP 285 GET /notifications/system/info/ HTTP/1.1
701 2021-09-26 08:40:00,092737 HTTP 292 GET /autorisationcuws-sap/system/info/ HTTP/1.1
702 2021-09-26 08:40:00,092818 HTTP 284 GET /uidparameter/system/info/ HTTP/1.1
703 2021-09-26 08:40:00,092822 HTTP 293 GET /livraisons-receptions/system/info/ HTTP/1.1
704 2021-09-26 08:40:00,092841 HTTP 297 GET /indicateurconditionnement/system/info/ HTTP/1.1
711 2021-09-26 08:40:00,093343 HTTP 282 GET /login-adfs/system/info/ HTTP/1.1
712 2021-09-26 08:40:00,093357 HTTP 286 GET /sync-carto-sia/system/info/ HTTP/1.1
713 2021-09-26 08:40:00,093460 HTTP 281 GET /oauth-sap/system/info/ HTTP/1.1
716 2021-09-26 08:40:00,093497 HTTP 291 GET /negoces-prestations/system/info/ HTTP/1.1
718 2021-09-26 08:40:00,093546 HTTP 291 GET /planning-betteraves/system/info/ HTTP/1.1
719 2021-09-26 08:40:00,093555 HTTP 282 GET /erc-config/system/info/ HTTP/1.1
720 2021-09-26 08:40:00,093564 HTTP 299 GET /parcelles-api/actuator/health/readiness/ HTTP/1.1
722 2021-09-26 08:40:00,093693 HTTP 286 GET /capital-social/system/info/ HTTP/1.1
723 2021-09-26 08:40:00,093738 HTTP 277 GET /proxy/system/info/ HTTP/1.1
724 2021-09-26 08:40:00,093783 HTTP 289 GET /droits-production/system/info/ HTTP/1.1
732 2021-09-26 08:40:00,094621 HTTP 283 GET /analyselabo/system/info/ HTTP/1.1
733 2021-09-26 08:40:00,094665 HTTP 296 GET /comptabilite-auxilliaire/system/info/ HTTP/1.1
734 2021-09-26 08:40:00,094804 HTTP 290 GET /administration-ihm/system/info/ HTTP/1.1
735 2021-09-26 08:40:00,094830 HTTP 283 GET /login-extra/system/info/ HTTP/1.1
736 2021-09-26 08:40:00,094842 HTTP 279 GET /graines/system/info/ HTTP/1.1
737 2021-09-26 08:40:00,094859 HTTP 284 GET /fin-campagne/system/info/ HTTP/1.1
738 2021-09-26 08:40:00,095018 HTTP 278 GET /paille/system/info/ HTTP/1.1
742 2021-09-26 08:40:00,095262 HTTP 278 GET /pulpes/system/info/ HTTP/1.1
743 2021-09-26 08:40:00,095270 HTTP 280 GET /enquetes/system/info/ HTTP/1.1
744 2021-09-26 08:40:00,095311 HTTP 287 GET /activedirectory/system/info/ HTTP/1.1
749 2021-09-26 08:40:00,096401 HTTP 278 GET /divers/system/info/ HTTP/1.1
750 2021-09-26 08:40:00,096411 HTTP 268 GET /mapproxy/ HTTP/1.1
751 2021-09-26 08:40:00,096419 HTTP 281 GET /videcache/system/info/ HTTP/1.1
752 2021-09-26 08:40:00,096424 HTTP 284 GET /transporteur/system/info/ HTTP/1.1
899 2021-09-26 08:41:00,091295 HTTP 280 GET /enquetes/system/info/ HTTP/1.1
901 2021-09-26 08:41:00,091833 HTTP 268 GET /mapproxy/ HTTP/1.1
904 2021-09-26 08:41:00,092356 HTTP 283 GET /partenaires/system/info/ HTTP/1.1
905 2021-09-26 08:41:00,092459 HTTP 290 GET /administration-ihm/system/info/ HTTP/1.1
907 2021-09-26 08:41:00,092900 HTTP 284 GET /uidparameter/system/info/ HTTP/1.1
911 2021-09-26 08:41:00,093250 HTTP 286 GET /capital-social/system/info/ HTTP/1.1
913 2021-09-26 08:41:00,093369 HTTP 284 GET /fin-campagne/system/info/ HTTP/1.1
915 2021-09-26 08:41:00,093398 HTTP 292 GET /autorisationcuws-sap/system/info/ HTTP/1.1
916 2021-09-26 08:41:00,093420 HTTP 284 GET /transporteur/system/info/ HTTP/1.1
917 2021-09-26 08:41:00,093460 HTTP 281 GET /oauth-sap/system/info/ HTTP/1.1
920 2021-09-26 08:41:00,093827 HTTP 281 GET /stats-sia/system/info/ HTTP/1.1
921 2021-09-26 08:41:00,093975 HTTP 283 GET /analyselabo/system/info/ HTTP/1.1
--924 2021-09-26 08:41:00,094259 HTTP 281 GET /document/system/health HTTP/1.1
937 2021-09-26 08:41:00,095226 HTTP 299 GET /parcelles-api/actuator/health/readiness/ HTTP/1.1
939 2021-09-26 08:41:00,095331 HTTP 291 GET /negoces-prestations/system/info/ HTTP/1.1
940 2021-09-26 08:41:00,095353 HTTP 278 GET /divers/system/info/ HTTP/1.1
943 2021-09-26 08:41:00,095538 HTTP 278 GET /paille/system/info/ HTTP/1.1
944 2021-09-26 08:41:00,095595 HTTP 294 GET /transporteur-reporting/system/info/ HTTP/1.1
945 2021-09-26 08:41:00,095618 HTTP 281 GET /videcache/system/info/ HTTP/1.1
946 2021-09-26 08:41:00,095628 HTTP 283 GET /facturation/system/info/ HTTP/1.1
947 2021-09-26 08:41:00,095709 HTTP 293 GET /livraisons-receptions/system/info/ HTTP/1.1
951 2021-09-26 08:41:00,095756 HTTP 283 GET /login-extra/system/info/ HTTP/1.1
952 2021-09-26 08:41:00,095829 HTTP 281 GET /agronomie/system/info/ HTTP/1.1
956 2021-09-26 08:41:00,095954 HTTP 277 GET /proxy/system/info/ HTTP/1.1
958 2021-09-26 08:41:00,096673 HTTP 287 GET /activedirectory/system/info/ HTTP/1.1
959 2021-09-26 08:41:00,096738 HTTP 279 GET /graines/system/info/ HTTP/1.1
960 2021-09-26 08:41:00,096808 HTTP 278 GET /pulpes/system/info/ HTTP/1.1
961 2021-09-26 08:41:00,096886 HTTP 282 GET /login-adfs/system/info/ HTTP/1.1
962 2021-09-26 08:41:00,096976 HTTP 282 GET /erc-config/system/info/ HTTP/1.1
963 2021-09-26 08:41:00,097097 HTTP 273 GET /geoserver/web/ HTTP/1.1
964 2021-09-26 08:41:00,097271 HTTP 296 GET /comptabilite-auxilliaire/system/info/ HTTP/1.1
965 2021-09-26 08:41:00,097391 HTTP 297 GET /indicateurconditionnement/system/info/ HTTP/1.1
966 2021-09-26 08:41:00,097824 HTTP 285 GET /notifications/system/info/ HTTP/1.1
967 2021-09-26 08:41:00,097970 HTTP 289 GET /droits-production/system/info/ HTTP/1.1
968 2021-09-26 08:41:00,098565 HTTP 291 GET /planning-betteraves/system/info/ HTTP/1.1
969 2021-09-26 08:41:00,098897 HTTP 286 GET /sync-carto-sia/system/info/ HTTP/1.1
Unfortunately the dump shows us that the request does not come out of the pod at 08:40. It's a problem related to the heartbeat pod.
Do you have any ideas to solve this problem ?
Thanks,
Guillaume