Pod is not running

Good morning,

We are trying to upgrade Kubernetes from version 1.19 to 1.20, and when we recreate an Elasticsearch data pod, the pod does not run and shows these errors in the logs:

{"type": "server", "timestamp": "2022-06-15T08:56:41,620Z", "level": "ERROR", "component": "o.e.i.g.DatabaseNodeService", "cluster.name": "elasticsearch", "node.name": "elasticsearch-es-data-node-4", "message": "failed to download database [GeoLite2-Country.mmdb]",
{"type": "server", "timestamp": "2022-06-15T08:56:41,620Z", "level": "ERROR", "component": "o.e.i.g.DatabaseNodeService", "cluster.name": "elasticsearch", "node.name": "elasticsearch-es-data-node-4", "message": "failed to download database [GeoLite2-City.mmdb]",
{"type": "server", "timestamp": "2022-06-15T08:56:41,620Z", "level": "ERROR", "component": "o.e.i.g.DatabaseNodeService", "cluster.name": "elasticsearch", "node.name": "elasticsearch-es-data-node-4", "message": "failed to download database [GeoLite2-ASN.mmdb]",
{"type": "server", "timestamp": "2022-06-15T08:56:42,873Z", "level": "ERROR", "component": "o.e.b.ElasticsearchUncaughtExceptionHandler", "cluster.name": "elasticsearch", "node.name": "elasticsearch-es-data-node-4", "message": "uncaught exception in thread [main]", "cluster.uuid": "jdhhplRoQ4aDyxUTyG_Fdw", "node.id": "zmTKRR-vRbOUcsuahwheeA" ,
{"type": "server", "timestamp": "2022-06-15T08:56:42,948Z", "level": "ERROR", "component": "o.e.x.d.l.DeprecationIndexingComponent", "cluster.name": "elasticsearch", "node.name": "elasticsearch-es-data-node-4", "message": "Bulk write of deprecation logs encountered some failures: [[GWuUZoEBBnz_bbyDSBe3 NodeClosedException[node closed {elasticsearch-es-data-node-4}{zmTKRR-vRbOUcsuahwheeA}{fP3r9cCuTuCGMk7JVmJZzw}{10.36.252.244}{10.36.252.244:9300}{dimr}{k8s_node_name=ip-10-36-252-252.eu-west-1.compute.internal, xpack.installed=true, transform.node=false}], GmuUZoEBBnz_bbyDSBe6 NodeClosedException[node closed {elasticsearch-es-data-node-4}{zmTKRR-vRbOUcsuahwheeA}{fP3r9cCuTuCGMk7JVmJZzw}{10.36.252.244}{10.36.252.244:9300}{dimr}{k8s_node_name=ip-10-36-252-252.eu-west-1.compute.internal, xpack.installed=true, transform.node=false}]]]", "cluster.uuid": "jdhhplRoQ4aDyxUTyG_Fdw", "node.id": "zmTKRR-vRbOUcsuahwheeA"  }
{"type": "server", "timestamp": "2022-06-15T08:56:43,093Z", "level": "ERROR", "component": "i.n.u.c.D.rejectedExecution", "cluster.name": "elasticsearch", "node.name": "elasticsearch-es-data-node-4", "message": "Failed to submit a listener notification task. Event loop shut down?", "cluster.uuid": "jdhhplRoQ4aDyxUTyG_Fdw", "node.id": "zmTKRR-vRbOUcsuahwheeA" ,

Here is the pod status during startup:

elasticsearch-es-data-node-4                             0/1     Init:2/5           0          37s     10.36.252.244   ip-10-36-252-252.eu-west-1.compute.internal   <none>           <none>
elasticsearch-es-data-node-4                             0/1     Init:3/5           0          38s     10.36.252.244   ip-10-36-252-252.eu-west-1.compute.internal   <none>           <none>
elasticsearch-es-data-node-4                             0/1     Init:3/5           0          39s     10.36.252.244   ip-10-36-252-252.eu-west-1.compute.internal   <none>           <none>
elasticsearch-es-data-node-4                             0/1     Init:4/5           0          42s     10.36.252.244   ip-10-36-252-252.eu-west-1.compute.internal   <none>           <none>
elasticsearch-es-data-node-4                             0/1     PodInitializing    0          43s     10.36.252.244   ip-10-36-252-252.eu-west-1.compute.internal   <none>           <none>
elasticsearch-es-data-node-4                             0/1     Running            0          44s     10.36.252.244   ip-10-36-252-252.eu-west-1.compute.internal   <none>           <none>
elasticsearch-es-data-node-4                             0/1     Error              0          87s     10.36.252.244   ip-10-36-252-252.eu-west-1.compute.internal   <none>           <none>
elasticsearch-es-data-node-4                             0/1     Running            1          88s     10.36.252.244   ip-10-36-252-252.eu-west-1.compute.internal   <none>           <none>
elasticsearch-es-data-node-4                             0/1     Error              1          2m11s   10.36.252.244   ip-10-36-252-252.eu-west-1.compute.internal   <none>           <none>
elasticsearch-es-data-node-4                             0/1     CrashLoopBackOff   1          2m23s   10.36.252.244   ip-10-36-252-252.eu-west-1.compute.internal   <none>           <none>
elasticsearch-es-data-node-4                             0/1     Running            2          2m23s   10.36.252.244   ip-10-36-252-252.eu-west-1.compute.internal   <none>           <none>
elasticsearch-es-data-node-4                             0/1     Error              2          3m8s    10.36.252.244   ip-10-36-252-252.eu-west-1.compute.internal   <none>           <none>
elasticsearch-es-data-node-4                             0/1     CrashLoopBackOff   2          3m23s   10.36.252.244   ip-10-36-252-252.eu-west-1.compute.internal   <none>           <none>
elasticsearch-es-data-node-4                             0/1     Running            3          3m36s   10.36.252.244   ip-10-36-252-252.eu-west-1.compute.internal   <none>           <none>
elasticsearch-es-data-node-4                             0/1     Error              3          4m19s   10.36.252.244   ip-10-36-252-252.eu-west-1.compute.internal   <none>           <none>
elasticsearch-es-data-node-4                             0/1     CrashLoopBackOff   3          4m34s   10.36.252.244   ip-10-36-252-252.eu-west-1.compute.internal   <none>           <none>
elasticsearch-es-data-node-4                             0/1     Running            4          5m      10.36.252.244   ip-10-36-252-252.eu-west-1.compute.internal   <none>           <none>
elasticsearch-es-data-node-4                             0/1     Error              4          5m44s   10.36.252.244   ip-10-36-252-252.eu-west-1.compute.internal   <none>           <none>
elasticsearch-es-data-node-4                             0/1     CrashLoopBackOff   4          5m57s   10.36.252.244   ip-10-36-252-252.eu-west-1.compute.internal   <none>           <none>
elasticsearch-es-data-node-4                             0/1     Running            5          7m12s   10.36.252.244   ip-10-36-252-252.eu-west-1.compute.internal   <none>           <none>
elasticsearch-es-data-node-4                             0/1     Error              5          7m55s   10.36.252.244   ip-10-36-252-252.eu-west-1.compute.internal   <none>           <none>
elasticsearch-es-data-node-4                             0/1     CrashLoopBackOff   5          8m6s    10.36.252.244   ip-10-36-252-252.eu-west-1.compute.internal   <none>           <none>
elasticsearch-es-data-node-4                             0/1     Running            6          10m     10.36.252.244   ip-10-36-252-252.eu-west-1.compute.internal   <none>           <none>
elasticsearch-es-data-node-4                             0/1     Error              6          11m     10.36.252.244   ip-10-36-252-252.eu-west-1.compute.internal   <none>           <none>
elasticsearch-es-data-node-4                             0/1     CrashLoopBackOff   6          11m     10.36.252.244   ip-10-36-252-252.eu-west-1.compute.internal   <none>           <none>
elasticsearch-es-data-node-4                             0/1     Running            7          16m     10.36.252.244   ip-10-36-252-252.eu-west-1.compute.internal   <none>           <none>
elasticsearch-es-data-node-4                             0/1     Error              7          17m     10.36.252.244   ip-10-36-252-252.eu-west-1.compute.internal   <none>           <none>
elasticsearch-es-data-node-4                             0/1     CrashLoopBackOff   7          17m     10.36.252.244   ip-10-36-252-252.eu-west-1.compute.internal   <none>           <none>

and the pod description:

  Normal   Scheduled               8m59s                  default-scheduler                                     Successfully assigned default/elasticsearch-es-data-node-4 to ip-10-36-252-252.eu-west-1.compute.internal
  Normal   SuccessfulAttachVolume  8m48s                  attachdetach-controller                               AttachVolume.Attach succeeded for volume "pvc-5a9bc55f-816f-4142-9385-65a6520d09ff"
  Normal   Pulling                 8m42s                  kubelet, ip-10-36-252-252.eu-west-1.compute.internal  Pulling image "docker.elastic.co/elasticsearch/elasticsearch:7.16.2"
  Normal   Pulled                  8m34s                  kubelet, ip-10-36-252-252.eu-west-1.compute.internal  Successfully pulled image "docker.elastic.co/elasticsearch/elasticsearch:7.16.2" in 8.074480126s
  Normal   Started                 8m30s                  kubelet, ip-10-36-252-252.eu-west-1.compute.internal  Started container elastic-internal-init-filesystem
  Normal   Created                 8m30s                  kubelet, ip-10-36-252-252.eu-west-1.compute.internal  Created container elastic-internal-init-filesystem
  Normal   Started                 8m29s                  kubelet, ip-10-36-252-252.eu-west-1.compute.internal  Started container elastic-internal-init-keystore
  Normal   Created                 8m29s                  kubelet, ip-10-36-252-252.eu-west-1.compute.internal  Created container elastic-internal-init-keystore
  Normal   Pulled                  8m29s                  kubelet, ip-10-36-252-252.eu-west-1.compute.internal  Container image "docker.elastic.co/elasticsearch/elasticsearch:7.16.2" already present on machine
  Normal   Pulled                  8m23s                  kubelet, ip-10-36-252-252.eu-west-1.compute.internal  Container image "docker.elastic.co/elasticsearch/elasticsearch:7.16.2" already present on machine
  Normal   Created                 8m23s                  kubelet, ip-10-36-252-252.eu-west-1.compute.internal  Created container elastic-internal-suspend
  Normal   Started                 8m23s                  kubelet, ip-10-36-252-252.eu-west-1.compute.internal  Started container elastic-internal-suspend
  Normal   Pulled                  8m22s                  kubelet, ip-10-36-252-252.eu-west-1.compute.internal  Container image "docker.elastic.co/elasticsearch/elasticsearch:7.16.2" already present on machine
  Normal   Started                 8m22s                  kubelet, ip-10-36-252-252.eu-west-1.compute.internal  Started container install-plugins
  Normal   Created                 8m22s                  kubelet, ip-10-36-252-252.eu-west-1.compute.internal  Created container install-plugins
  Normal   Pulled                  8m18s                  kubelet, ip-10-36-252-252.eu-west-1.compute.internal  Container image "docker.elastic.co/elasticsearch/elasticsearch:7.16.2" already present on machine
  Normal   Created                 8m18s                  kubelet, ip-10-36-252-252.eu-west-1.compute.internal  Created container sysctl
  Normal   Started                 8m18s                  kubelet, ip-10-36-252-252.eu-west-1.compute.internal  Started container sysctl
  Normal   Pulled                  8m17s                  kubelet, ip-10-36-252-252.eu-west-1.compute.internal  Container image "docker.elastic.co/elasticsearch/elasticsearch:7.16.2" already present on machine
  Normal   Created                 8m17s                  kubelet, ip-10-36-252-252.eu-west-1.compute.internal  Created container elasticsearch
  Normal   Started                 8m17s                  kubelet, ip-10-36-252-252.eu-west-1.compute.internal  Started container elasticsearch
  Warning  Unhealthy               8m4s                   kubelet, ip-10-36-252-252.eu-west-1.compute.internal  Readiness probe failed: {"timestamp": "2022-06-15T08:46:17+00:00", "message": "readiness probe failed", "curl_rc": "7"}
  Warning  Unhealthy               7m59s                  kubelet, ip-10-36-252-252.eu-west-1.compute.internal  Readiness probe failed: {"timestamp": "2022-06-15T08:46:22+00:00", "message": "readiness probe failed", "curl_rc": "7"}
  Warning  Unhealthy               7m54s                  kubelet, ip-10-36-252-252.eu-west-1.compute.internal  Readiness probe failed: {"timestamp": "2022-06-15T08:46:27+00:00", "message": "readiness probe failed", "curl_rc": "7"}
  Warning  Unhealthy               7m49s                  kubelet, ip-10-36-252-252.eu-west-1.compute.internal  Readiness probe failed: {"timestamp": "2022-06-15T08:46:32+00:00", "message": "readiness probe failed", "curl_rc": "7"}
  Warning  Unhealthy               7m44s                  kubelet, ip-10-36-252-252.eu-west-1.compute.internal  Readiness probe failed: {"timestamp": "2022-06-15T08:46:37+00:00", "message": "readiness probe failed", "curl_rc": "7"}
  Warning  Unhealthy               7m39s                  kubelet, ip-10-36-252-252.eu-west-1.compute.internal  Readiness probe failed: {"timestamp": "2022-06-15T08:46:42+00:00", "message": "readiness probe failed", "curl_rc": "7"}
  Warning  Unhealthy               3m39s (x22 over 7m9s)  kubelet, ip-10-36-252-252.eu-west-1.compute.internal  (combined from similar events): Readiness probe failed: {"timestamp": "2022-06-15T08:50:42+00:00", "message": "readiness probe failed", "curl_rc": "7"}
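
The readiness probe output reports curl_rc 7, i.e. curl could not connect at all, so Elasticsearch never came up on its HTTP port. To pull the full startup error from the crashed container (container name and namespace taken from the events above), something like the following works:

kubectl -n default logs elasticsearch-es-data-node-4 -c elasticsearch --previous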

Any idea about these errors? Thanks.

More information about the error:

uncaught exception in thread [main]
BindTransportException[Failed to resolve publish address]; nested: UnknownHostException[elasticsearch-es-data-node-1.elasticsearch-es-data-node.default.svc: Temporary failure in name resolution];
Likely root cause: java.net.UnknownHostException: elasticsearch-es-data-node-1.elasticsearch-es-data-node.default.svc: Temporary failure in name resolution
        at java.base/java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
        at java.base/java.net.InetAddress$PlatformNameService.lookupAllHostAddr(InetAddress.java:933)
        at java.base/java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1519)
        at java.base/java.net.InetAddress$NameServiceAddresses.get(InetAddress.java:852)
        at java.base/java.net.InetAddress.getAllByName0(InetAddress.java:1509)
        at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1367)
        at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1301)
        at org.elasticsearch.common.network.NetworkService.resolveInternal(NetworkService.java:270)
        at org.elasticsearch.common.network.NetworkService.resolveInetAddresses(NetworkService.java:218)
        at org.elasticsearch.common.network.NetworkService.resolvePublishHostAddresses(NetworkService.java:170)
        at org.elasticsearch.http.AbstractHttpServerTransport.bindServer(AbstractHttpServerTransport.java:168)
        at org.elasticsearch.http.netty4.Netty4HttpServerTransport.doStart(Netty4HttpServerTransport.java:255)
        at org.elasticsearch.xpack.security.transport.netty4.SecurityNetty4HttpServerTransport.doStart(SecurityNetty4HttpServerTransport.java:78)
        at org.elasticsearch.common.component.AbstractLifecycleComponent.start(AbstractLifecycleComponent.java:48)
        at org.elasticsearch.node.Node.start(Node.java:1267)
        at org.elasticsearch.bootstrap.Bootstrap.start(Bootstrap.java:335)
        at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:443)
        at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:166)
        at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:157)
        at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:77)
        at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:112)
        at org.elasticsearch.cli.Command.main(Command.java:77)
        at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:122)
        at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:80)
For complete error details, refer to the log at /usr/share/elasticsearch/logs/elasticsearch.log
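
The UnknownHostException ("Temporary failure in name resolution") points at in-cluster DNS rather than Elasticsearch itself. A quick way to confirm, using the hostname from the log above (the test pod name and busybox image are only an illustration):

kubectl run -it --rm dns-test --image=busybox:1.28 --restart=Never -- nslookup elasticsearch-es-data-node-1.elasticsearch-es-data-node.default.svc
kubectl -n kube-system get pods -l k8s-app=kube-dns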

Cluster configuration:

cluster:
  name: elasticsearch
  routing:
    allocation:
      awareness:
        attributes: k8s_node_name
discovery:
  seed_providers: file
http:
  publish_host: ${POD_NAME}.${HEADLESS_SERVICE_NAME}.${NAMESPACE}.svc
indices:
  memory:
    index_buffer_size: 20%
    min_index_buffer_size: 96mb
network:
  host: "0"
  publish_host: ${POD_IP}
node:
  attr:
    k8s_node_name: ${NODE_NAME}
  name: ${POD_NAME}
  roles:
  - master
  - data
  - ingest
  - remote_cluster_client
path:
  data: /usr/share/elasticsearch/data
  logs: /usr/share/elasticsearch/logs
xpack:
  license:
    upload:
      types:
      - trial
      - enterprise
  security:
    authc:
      realms:
        file:
          file1:
            order: -100
        ldap:
          ldap1:
            bind_dn: dn
            files:
              role_mapping: /usr/share/elasticsearch/config/role-mapping/role-mapping.yml
            group_search:
              base_dn: dn
            order: 1
            secure_bind_password: xpack.security.authc.realms.ldap.ldap1.secure_bind_password
            ssl:
              certificate_authorities:
              - /usr/share/elasticsearch/config/ldap-certs/ldap_ca.pem
              verification_mode: certificate
            unmapped_groups_as_roles: false
            url: ldaps://ldaps:636
            user_search:
              base_dn: dn
              filter: (sAMAccountName={0})
        native:
          native1:
            order: -99
      reserved_realm:
        enabled: "false"
    enabled: "true"
    http:
      ssl:
        certificate: /usr/share/elasticsearch/config/http-certs/tls.crt
        certificate_authorities: /usr/share/elasticsearch/config/http-certs/ca.crt
        enabled: true
        key: /usr/share/elasticsearch/config/http-certs/tls.key
    transport:
      ssl:
        certificate: /usr/share/elasticsearch/config/node-transport-cert/transport.tls.crt
        certificate_authorities:
        - /usr/share/elasticsearch/config/transport-certs/ca.crt
        - /usr/share/elasticsearch/config/transport-remote-certs/ca.crt
        enabled: "true"
        key: /usr/share/elasticsearch/config/node-transport-cert/transport.tls.key
        verification_mode: certificate
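
With the settings above, http.publish_host expands to ${POD_NAME}.${HEADLESS_SERVICE_NAME}.${NAMESPACE}.svc, which for this pod should be something like:

elasticsearch-es-data-node-4.elasticsearch-es-data-node.default.svc

That is exactly the kind of name the UnknownHostException complains about, so the node aborts at startup when it cannot resolve its own publish address through cluster DNS.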

Sorry, the Kubernetes CoreDNS turned out to be the problem; the latest version is not working correctly for us...

I think I have the same kind of issue. How did you solve the CoreDNS issue?
I'm running my cluster on MicroK8s...

I'm facing a similar issue with a kubeadm Kubernetes installation. Has anyone found a solution? Please reply.

uncaught exception in thread [main]
org.elasticsearch.transport.BindTransportException: Failed to resolve publish address
Likely root cause: java.net.UnknownHostException: quickstart-es-default-0.quickstart-es-default.observability.svc
at java.base/java.net.InetAddress$CachedAddresses.get(InetAddress.java:948)
at java.base/java.net.InetAddress.getAllByName0(InetAddress.java:1628)
at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1494)
at org.elasticsearch.common.network.NetworkService.resolveInternal(NetworkService.java:267)
at org.elasticsearch.common.network.NetworkService.resolveInetAddresses(NetworkService.java:215)
at org.elasticsearch.common.network.NetworkService.resolvePublishHostAddresses(NetworkService.java:167)
at org.elasticsearch.http.AbstractHttpServerTransport.bindServer(AbstractHttpServerTransport.java:171)
at org.elasticsearch.http.netty4.Netty4HttpServerTransport.doStart(Netty4HttpServerTransport.java:249)
at org.elasticsearch.xpack.security.transport.netty4.SecurityNetty4HttpServerTransport.doStart(SecurityNetty4HttpServerTransport.java:78)
at org.elasticsearch.common.component.AbstractLifecycleComponent.start(AbstractLifecycleComponent.java:48)
at org.elasticsearch.node.Node.start(Node.java:1246)
at org.elasticsearch.bootstrap.Bootstrap.start(Bootstrap.java:272)
at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:367)
at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:169)
at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:160)
at org.elasticsearch.common.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:81)
at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:112)
at org.elasticsearch.cli.Command.main(Command.java:77)
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:125)
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:80)
For complete error details, refer to the log at /usr/share/elasticsearch/logs/quickstart.log

Sorry for the delay. We upgraded the CoreDNS version, but during the upgrade we did not apply a required ConfigMap change. After applying that change, CoreDNS runs well, the pods resolve pod DNS names again, and communication works fine.
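
For anyone hitting the same thing: you can review the CoreDNS ConfigMap, restart CoreDNS after editing it, and verify the pods with something like the commands below (names assume a standard kube-system install; MicroK8s manages DNS through its own addon, so the exact steps differ there):

kubectl -n kube-system get configmap coredns -o yaml          # review the Corefile
kubectl -n kube-system rollout restart deployment coredns     # pick up the edited ConfigMap
kubectl -n kube-system get pods -l k8s-app=kube-dns           # confirm the CoreDNS pods are Ready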
