Kibana/Elastic Cloud on Kubernetes - Unable to revive connection

Good day!
I am trying to install Elastic Cloud on Kubernetes (https://www.elastic.co/guide/en/cloud-on-k8s/current/index.html) and am having trouble getting Kibana up and running.

Kibana health status is Red:

NAME         HEALTH   NODES   VERSION   AGE
quickstart   red              7.1.0     41m

Kibana pod never starts:

NAME                                 READY   STATUS    RESTARTS   AGE
quickstart-kibana-5fd9686dff-9t9jd   0/1     Running   0          5s

The following is logged in the Kibana pod:

{"type":"log","@timestamp":"2019-05-27T15:31:12Z","tags":["warning","elasticsearch","admin"],"pid":1,"message":"Unable to revive connection: https://quickstart-es.elastic-system.svc.cluster.local:9200/"}

The cluster name/domain name is customized and is NOT cluster.local

Testing name resolution from the Elasticsearch pod:

[root@quickstart-es-gs478lqblx elasticsearch]# ping quickstart-es.elastic-system.svc.cluster.local
ping: quickstart-es.elastic-system.svc.cluster.local: Name or service not known

[root@quickstart-es-gs478lqblx elasticsearch]# ping quickstart-es
PING quickstart-es.elastic-system.svc.farting.owl (10.233.4.199) 56(84) bytes of data.
64 bytes from quickstart-es.elastic-system.svc.farting.owl (10.233.4.199): icmp_seq=1 ttl=64 time=0.111 ms
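
For anyone checking the same thing: the domain a pod actually uses can usually be read from the search line of its /etc/resolv.conf, which in a typical Kubernetes pod lists <namespace>.svc.<domain>, svc.<domain> and <domain>. The following is only a minimal sketch assuming that layout; the function name cluster_domain_from_resolv is hypothetical and exists purely for illustration.

```shell
# Print the cluster domain found in a resolv.conf-style file.
# Assumes the usual Kubernetes search line, for example:
#   search elastic-system.svc.farting.owl svc.farting.owl farting.owl
# cluster_domain_from_resolv is a hypothetical helper, not part of any tool.
cluster_domain_from_resolv() {
  awk '/^search/ {
    for (i = 2; i <= NF; i++) {
      if ($i ~ /^svc\./) {        # the entry of the form svc.<domain>
        sub(/^svc\./, "", $i)     # strip the leading "svc."
        print $i
        exit
      }
    }
  }' "$1"
}
```

Run inside a pod as cluster_domain_from_resolv /etc/resolv.conf; if the search line has the assumed shape, the printed value is the domain to expect in place of cluster.local.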

The issue seems to be related to name resolution. I believe Kibana/Elasticsearch needs to be deployed with the cluster's domain name in mind. I tried editing the Kibana quickstart resource to match the domain name, but even after a successful modification the URL is always reverted to cluster.local:

spec:
  elasticsearch:
    auth:
      secret:
        key: kibana-user
        name: quickstart-kibana-user
    caCertSecret: quickstart-es-ca
    url: https://quickstart-es.elastic-system.svc.cluster.local:9200

I also tried deploying with the URL modified in the YAML, but it always ends up as cluster.local.

I may need some help here, thanks in advance!

Henro

@tiagocosta can we please get some help?

Thanks,
Bhavya

@henro what version of the stack are you deploying? 7.x?

Are you able to curl https://quickstart-es.elastic-system.svc.cluster.local:9200 successfully?
Is https://quickstart-es.elastic-system.svc.cluster.local:9200 accessible to Kibana? That looks like the problem to me. Also, if this is a 7.x deployment, are you sure the elasticsearch.hosts setting in your Kibana configuration (which usually points to localhost:9200) is actually reaching your Elasticsearch deployment?

@tiagocosta version 7.1.0

https://quickstart-es.elastic-system.svc.cluster.local:9200 is not accessible since it's not resolvable. I assume this is the issue, since the cluster domain is not the default cluster.local.

I did not customize elasticsearch.hosts, so it is probably set to the default, but since the probe fails the pod does not even start.

@henro here you can find a list of sample resources for deploying Kibana and Elasticsearch to k8s: https://github.com/elastic/cloud-on-k8s/tree/0.8/operators/config/samples

In your configuration, have you assigned the elasticsearchRef property of the Kibana deployment to the correct Elasticsearch resource? You can check it here: https://github.com/elastic/cloud-on-k8s/blob/0.8/operators/config/samples/kibana/kibana_es.yaml#L81

@tiagocosta

It seems that the URL is always overwritten to *cluster.local no matter how I enter it or edit the Kibana resource.
elasticsearchRef is also set to quickstart automatically, which correctly refers to the ES quickstart resource:

spec:
  elasticsearch:
    auth:
      secret:
        key: kibana-user
        name: quickstart-kibana-user
    caCertSecret: quickstart-es-ca
    url: https://quickstart-es.elastic-system.svc.cluster.local:9200
  elasticsearchRef:
    name: quickstart

It also looks like Elasticsearch is not ready, and the cluster.local domain is being used there too:

{"level":"error","ts":1559088970.8155234,"logger":"kubebuilder.controller","msg":"Reconciler error","controller":"elasticsearch-controller","request":"elastic-system/quickstart","error":"Put https://quickstart-es.elastic-system.svc.cluster.local:9200/_cluster/settings: dial tcp: lookup quickstart-es.elastic-system.svc.cluster.local on 10.233.0.3:53: no such host","errorCauses":[{"error":"Put https://quickstart-es.elastic-system.svc.cluster.local:9200/_cluster/settings: dial tcp: lookup quickstart-es.elastic-system.svc.cluster.local on 10.233.0.3:53: no such host"}],"stacktrace":"github.com/elastic/cloud-on-k8s/operators/vendor/github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/github.com/go-logr/zapr/zapr.go:128\ngithub.com/elastic/cloud-on-k8s/operators/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:217\ngithub.com/elastic/cloud-on-k8s/operators/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:158\ngithub.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133\ngithub.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134\ngithub.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait.Until\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88"}

And this actually happens before Kibana is even created. For some reason the wrong domain name, cluster.local, is generated automatically.

@henro what config are you using for your deployment? Everything should work if you use the sample we provide here: https://github.com/elastic/cloud-on-k8s/blob/0.8/operators/config/samples/kibana/kibana_es.yaml (in that case you will deploy Kibana along with Elasticsearch).

If you have separate deployments for the two, you shouldn't need elasticsearchRef; you just need to define the Elasticsearch URL:

spec:
  version: "7.1.0"
  elasticsearch:
    url: https://url.to.elasticsearch:9200

like we have here https://github.com/elastic/cloud-on-k8s/blob/0.8/operators/config/samples/kibana/kibana.yaml

Please provide more details about the way you are deploying ES and Kibana, and share your k8s configs too.

Cheers

@tiagocosta
I followed this guide https://www.elastic.co/guide/en/cloud-on-k8s/current/index.html

Install the operator and custom resource definition
kubectl apply -f https://download.elastic.co/downloads/eck/0.8.0/all-in-one.yaml

Install ES

cat <<EOF | kubectl apply -f -
apiVersion: elasticsearch.k8s.elastic.co/v1alpha1
kind: Elasticsearch
metadata:
  name: quickstart
spec:
  version: 7.1.0
  nodes:
  - nodeCount: 1
    config:
      node.master: true
      node.data: true
      node.ingest: true
EOF

Install Kibana:

cat <<EOF | kubectl apply -f -
apiVersion: kibana.k8s.elastic.co/v1alpha1
kind: Kibana
metadata:
  name: quickstart
spec:
  version: 7.1.0
  nodeCount: 1
  elasticsearchRef:
    name: quickstart
EOF

I tried deploying Kibana with the url explicitly set to my cluster domain name, but it is always changed automatically.

My K8S cluster is deployed using Kubespray with a pretty vanilla config, but the cluster name has been customized. It is currently running v1.14.1.

I can share any K8S config but I am not sure what specifically you are looking for

@henro please just share all the configs you are applying to your Elasticsearch and Kibana k8s deployment, and also let us know what your custom cluster name is.

This is a known issue, see https://github.com/elastic/cloud-on-k8s/issues/939. Until we fix the issue in the operator, the only workaround at the moment is to configure a DNS rewrite in the coredns configmap, like so:

rewrite name substring cluster.local my.name
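
As a rough sketch of where that line could go: the fragment below assumes a typical Corefile stored in a ConfigMap named coredns in the kube-system namespace. The ConfigMap name, the namespace, and every directive other than the rewrite line are assumptions and will differ per cluster; farting.owl stands in for my.name, since it is the custom domain shown earlier in this thread.

```
.:53 {
    errors
    # Assumed placement: map queries for *.cluster.local onto the real cluster domain
    rewrite name substring cluster.local farting.owl
    kubernetes farting.owl in-addr.arpa ip6.arpa {
        fallthrough in-addr.arpa ip6.arpa
    }
    forward . /etc/resolv.conf
    cache 30
}
```

Only the rewrite line should need to be added to whatever Corefile the cluster already has; the rest is shown for context and should be left as deployed.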

@henro could you try the workaround from the last post? I was not aware of that issue, but it makes sense. Thanks for bringing it up, @pebrc.

@pebrc @tiagocosta thanks, I will try this, but unfortunately I won't have access to this cluster until tomorrow. I will keep you posted.
