Quickstart "health" and "phase" are empty

shirishatideal · September 5, 2019, 6:29am

I am trying ECK but got stuck right at the start:

$ kubectl get elasticsearch quickstart
NAME         HEALTH   NODES   VERSION   PHASE   AGE
quickstart                    7.2.0             24m

Kubernetes - v1.15.3
CentOS 7 AWS instance (t3.large)

The operator logs show a timeout; it seems to be trying to pull non-existent GitHub resources:
github.com/elastic/cloud-on-k8s/operators/

I can't see "operators" in cloud-on-k8s.

Any guidance on troubleshooting is appreciated!

Shirish

Thibault_Richard · September 5, 2019, 9:06am

Hello Shirish,

The 'kubectl describe' command often provides more information to help understand what's going on. Can you run it on your elasticsearch resource (kubectl describe elasticsearch) and your pods (kubectl describe pods) and share the outputs?

Here is the documentation for troubleshooting your cluster: https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-troubleshooting.html

You can't see "operators" in the cloud-on-k8s GitHub repository because we moved the content of this directory up a level (https://github.com/elastic/cloud-on-k8s/pull/1616).

shirishatideal · September 6, 2019, 2:32am

Hello Richard,

Thanks for taking the time to look into this!
I went through the troubleshooting page and found a useful way to enable debug logging, but it didn't help me get any further.

I ran the following commands and posted the output at https://pastebin.com/UTJiRnUu

$ kubectl get all -n elastic-system
$ kubectl get events -n elastic-system
$ kubectl describe pods -n elastic-system
$ kubectl -n elastic-system logs statefulset.apps/elastic-operator

Then I set "--enable-debug-logs=true" and repeated:

$ kubectl -n elastic-system logs statefulset.apps/elastic-operator

I suspect the change in the GitHub file structure hasn't been reflected in the "code", but I might be wrong. Pardon my ignorance, as I am just an infra guy.
Thanks,
Shirish

Thibault_Richard · September 6, 2019, 7:22am

Hi,

The ECK operator looks healthy, but I do not have enough information to debug further. By default, the operator is deployed in the 'elastic-system' namespace and manages Elasticsearch, Kibana and APM Server resources in the 'default' namespace.

Can you provide info about the Elasticsearch resource and its associated pods (without filtering by the 'elastic-system' namespace)?

kubectl get elasticsearch
kubectl describe elasticsearch
kubectl get pods
kubectl describe pods
kubectl get events
kubectl describe events

shirishatideal · September 6, 2019, 8:13am

Hello Richard,

The output of the commands is uploaded at https://pastebin.com/HQ9LaK5a

Thanks,
Shirish

sarjeet · September 10, 2019, 7:29pm

@shirishatideal hmm, I am not sure if this is the issue but here is my guess:

1. The Elasticsearch CR sets version: 7.3.0, which differs from the version: 7.2.0 used in the quickstart guide. Can you see if the same issue occurs after changing it to 7.2.0?
2. The error in the operator log suggests (AFAIK) that some validation of the CR is failing, possibly because the new field spec.nodes.name is missing from the CR. Can you try adding it and see if that helps?

Option (2) may not work if the operator is running an older version, or if older CRDs were submitted, in which this field may not exist.

Let me know if either of these options helps in troubleshooting your issue.

shirishatideal · September 12, 2019, 2:20am

Hello Sarjeet,

Thanks for your attention.

1. The 7.2.0 version gives the exact same results.
2. I am unsure how to make those changes.

Shirish

sebgl · September 12, 2019, 12:45pm

"Timeout: request did not complete within requested timeout 30s" this seems to be the problem. I think this is an error returned by the apiserver to the operator.
I'm wondering if there might be some kind of firewall/network issue preventing the operator from reaching the apiserver.
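
One way to tell whether the timeout keeps occurring after each change is to count matching lines in the operator log. This is only a minimal sketch, assuming a POSIX shell; count_timeouts is a hypothetical helper name, and the match string is taken from the message quoted above.

```shell
# Count log lines on stdin that contain the apiserver timeout message.
# count_timeouts is a hypothetical helper, not part of ECK or kubectl.
count_timeouts() {
  # grep -c prints the number of matching lines; "|| true" keeps the
  # exit status at 0 when there are no matches.
  grep -c 'request did not complete within requested timeout' || true
}

# Example against a live cluster (log command taken from earlier in this thread):
#   kubectl -n elastic-system logs statefulset.apps/elastic-operator | count_timeouts
```

A count that stays at zero after a firewall or network change would suggest the timeout is gone; a count that keeps growing would suggest the problem is still there.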

sarjeet · September 12, 2019, 5:23pm

@shirishatideal

Regarding (1), did you delete the previous CR completely and submit a new CR with the 7.2.0 version? Can you try deleting/cleaning everything and retrying, following every instruction exactly as written in the quickstart, to see if that helps?

For (2), you'll need to add the name to spec.nodes. For example:

nodes:
    - nodeCount: 2
      name: testgroup1
      config:
        node.master: true
        node.data: true
        node.ingest: true
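
To show where that fragment sits, here is a minimal sketch of a complete manifest written to a file. The apiVersion (v1alpha1) and the file name quickstart.yaml are assumptions that are not stated in this thread, so check them against the CRDs installed with your operator. The resource name and the 7.2.0 version come from the posts above, and the nodes entry is copied from the fragment.

```shell
# Write a complete Elasticsearch manifest that includes a named node group.
# Assumptions: the apiVersion below and the file name quickstart.yaml.
cat > quickstart.yaml <<'EOF'
apiVersion: elasticsearch.k8s.elastic.co/v1alpha1  # assumed; verify against your installed CRDs
kind: Elasticsearch
metadata:
  name: quickstart
spec:
  version: 7.2.0
  nodes:
    - nodeCount: 2
      name: testgroup1
      config:
        node.master: true
        node.data: true
        node.ingest: true
EOF
```

After deleting the old resource (kubectl delete elasticsearch quickstart), the file could be submitted with kubectl apply -f quickstart.yaml.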

If neither of these works, then it could be a setup or environment issue. You can try these steps on minikube and then debug the non-working setup step by step.
