APM Server unable to connect to Elasticsearch

Kibana version:
7.13.4
Elasticsearch version:
7.13.4
APM Server version:
7.13.4
APM Agent language and version:
NA
Browser version:
NA
Original install method (e.g. download page, yum, deb, from source, etc.) and version:
Helm
Fresh install or upgraded from other version?
Fresh
Is there anything special in your setup? For example, are you using the Logstash or Kafka outputs? Are you using a load balancer in front of the APM Servers? Have you changed index pattern, generated custom templates, changed agent configuration etc.
NA
Description of the problem including expected versus actual behavior. Please include screenshots (if relevant):
My Elastic Stack:
3 x Master Pods
2 x Data Pods
2 x Ingest (coordinating) Pods
2 x Kibana Pods
1 x APM Server Pod

The APM Server logs keep reporting that it is unable to connect to Elasticsearch.

Steps to reproduce:

  1. Install APM Server with Helm

Provide logs and/or server output (if relevant):
Although the APM pod status is Running, the logs keep showing the following error:

{"log.level":"error","@timestamp":"2021-07-31T11:54:44.457Z","log.logger":"publisher_pipeline_output","log.origin":{"file.name":"pipeline/output.go","file.line":154},"message":"Failed to connect to backoff(elasticsearch(https://es-client.elasticsearch.svc.cluster.local:9200)): Get \"https://es-client.elasticsearch.svc.cluster.local:9200\": x509: certificate is valid for es-client-1.es-client-service-headless, es-client-service, es-client-service-headless, es-client-1, es-client-1.es-client-service, not es-client.elasticsearch.svc.cluster.local","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-07-31T11:54:44.457Z","log.logger":"publisher_pipeline_output","log.origin":{"file.name":"pipeline/output.go","file.line":145},"message":"Attempting to reconnect to backoff(elasticsearch(https://es-client.elasticsearch.svc.cluster.local:9200)) with 15 reconnect attempt(s)","ecs.version":"1.6.0"}
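The x509 error above means that the hostname in the output URL is not among the DNS names (subject alternative names) listed in the certificate that Elasticsearch presents. As a minimal sketch of that check, the Python snippet below compares the requested hostname with the names copied from the log line. The helper names `dns_name_matches` and `hostname_covered` are hypothetical, and the matching is a simplified version of what a TLS client does (exact match, or a single leading `*` label), not the actual Go implementation used by APM Server.

```python
def dns_name_matches(pattern: str, hostname: str) -> bool:
    """Simplified DNS name match: exact, or one leading '*' wildcard label."""
    p_labels = pattern.lower().rstrip(".").split(".")
    h_labels = hostname.lower().rstrip(".").split(".")
    if len(p_labels) != len(h_labels):
        return False
    if p_labels[0] == "*":
        # The wildcard stands for exactly one label; the rest must be equal.
        return p_labels[1:] == h_labels[1:]
    return p_labels == h_labels


def hostname_covered(hostname: str, sans: list) -> bool:
    """Return True if any certificate DNS name covers the hostname."""
    return any(dns_name_matches(san, hostname) for san in sans)


# DNS names copied from the error message in the log above.
SANS_FROM_LOG = [
    "es-client-1.es-client-service-headless",
    "es-client-service",
    "es-client-service-headless",
    "es-client-1",
    "es-client-1.es-client-service",
]
# Hostname taken from output.elasticsearch.hosts.
REQUESTED = "es-client.elasticsearch.svc.cluster.local"

if __name__ == "__main__":
    print(hostname_covered(REQUESTED, SANS_FROM_LOG))
```

None of the listed names is the fully qualified service name, so the check fails. Either the certificate needs that name added, or the output host has to be one of the names the certificate already carries.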

Below is the values.yml for APM Server:

---
# Allows you to add config files
apmConfig:
  apm-server.yml: |
    apm-server:
      host: "0.0.0.0:8200"
    queue: {}
    output.elasticsearch:
      hosts: ["https://es-client.elasticsearch.svc.cluster.local:9200"]
      username: "${ELASTICSEARCH_USERNAME}"
      password: "${ELASTICSEARCH_PASSWORD}"
      protocol: https
      ssl.enabled: true
      ssl.certificate_authorities: /usr/share/apm-server/config/certs/ca.crt
      ssl.key: /usr/share/apm-server/config/certs/tls.key
      ssl.certificate: /usr/share/apm-server/config/certs/tls.crt

replicas: 1

extraEnvs:
   - name: 'ELASTICSEARCH_USERNAME'
     valueFrom:
       secretKeyRef:
         name: es-credentials
         key: username
   - name: 'ELASTICSEARCH_PASSWORD'
     valueFrom:
       secretKeyRef:
         name: es-credentials
         key: password

image: "470776511283.dkr.ecr.ap-south-1.amazonaws.com/dev-reco-apm"
imageTag: "latest"
imagePullPolicy: "IfNotPresent"
imagePullSecrets: []

managedServiceAccount: true


podSecurityContext:
  fsGroup: 1000
  runAsUser: 1000
  runAsGroup: 0

securityContext:
  privileged: false
  runAsNonRoot: true
  runAsUser: 1000
  runAsGroup: 0

livenessProbe:
  httpGet:
    path: /
    port: http
  initialDelaySeconds: 30
  failureThreshold: 3
  periodSeconds: 10
  timeoutSeconds: 5

readinessProbe:
  httpGet:
    path: /
    port: http
  initialDelaySeconds: 30
  failureThreshold: 3
  periodSeconds: 10
  timeoutSeconds: 5

resources:
    requests:
      cpu: "100m"
      memory: "100Mi"
    limits:
      cpu: "200m"
      memory: "512Mi"

secretMounts:
  - name: elastic-certificates
    secretName: es-cert
    path: /usr/share/apm-server/config/certs

terminationGracePeriod: 30

affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: app
              operator: In
              values:
              - es-data
          topologyKey: kubernetes.io/hostname

updateStrategy:
  type: "RollingUpdate"

autoscaling:
  enabled: true
  minReplicas: 1
  maxReplicas: 3
  averageCpuUtilization: 50

service:
  type: ClusterIP
  loadBalancerIP: ""
  port: 8200
  nodePort: ""
  annotations: {}

Please help us understand what is wrong. I can also provide the Elasticsearch YAML if needed.

Regards
Nitin G

The es-credentials secret contains the elastic user credentials.

The problem turned out to be the way the DNS names were defined in the node certificates: they did not include the hostname that APM Server was connecting to. After we corrected that, the logs showed a successful connection:

{"log.level":"info","@timestamp":"2021-08-01T11:08:08.352Z","log.logger":"publisher_pipeline_output","log.origin":{"file.name":"pipeline/output.go","file.line":151},"message":"Connection to backoff(elasticsearch(https://es-client.es.svc.cluster.local:9200)) established","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-08-01T11:08:19.743Z","log.logger":"request","log.origin":{"file.name":"middleware/log_middleware.go","file.line":63},"message":"request ok","url.original":"/","http.request.method":"GET","user_agent.original":"kube-probe/1.19+","source.address":"192.168.183.140","http.request.body.bytes":0,"http.request.id":"eae1ddab-19f5-426b-848d-c70f26f0b51c","event.duration":126031,"http.response.status_code":200,"ecs.version":"1.6.0"}
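For anyone who wants to confirm which DNS names a node certificate carries before or after a fix like this, the following Python sketch connects to the HTTPS endpoint and prints the names. It is only an illustration: the function names are hypothetical, the host, port and CA path are passed on the command line, and it assumes the server does not require a client certificate. Hostname checking is turned off so the handshake completes even when the name is wrong, while the chain is still verified against the CA, which is what makes `getpeercert()` return the parsed certificate.

```python
import socket
import ssl
import sys


def dns_sans(peercert: dict) -> list:
    """Extract the DNS entries from the dict returned by getpeercert()."""
    return [value for kind, value in peercert.get("subjectAltName", ()) if kind == "DNS"]


def fetch_peer_cert(host: str, port: int, cafile: str) -> dict:
    """Fetch the server certificate, verifying the chain but not the hostname."""
    context = ssl.create_default_context(cafile=cafile)
    context.check_hostname = False  # verify_mode stays CERT_REQUIRED
    with socket.create_connection((host, port), timeout=5) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


if __name__ == "__main__":
    # Example arguments (assumed): <service hostname> 9200 /path/to/ca.crt
    host, port, cafile = sys.argv[1], int(sys.argv[2]), sys.argv[3]
    for name in dns_sans(fetch_peer_cert(host, port, cafile)):
        print(name)
```

If the hostname used in output.elasticsearch.hosts does not appear in the printed list (directly or through a wildcard), APM Server will keep failing with the x509 error shown earlier.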
