The 'kubectl describe' command often provides more information to understand what's going on. Can you run it on your elasticsearch resource (kubectl describe elasticsearch) and your pods (kubectl describe pods) and share the outputs?
Thanks for taking the time to look into this!
I went through the troubleshooting page and found a useful way to enable debug logging, but it didn't help me take this any further.
#kubectl get all -n elastic-system
#kubectl get events -n elastic-system
#kubectl describe pods -n elastic-system
#kubectl -n elastic-system logs statefulset.apps/elastic-operator
Then I set "--enable-debug-logs=true" and repeated:
#kubectl -n elastic-system logs statefulset.apps/elastic-operator
I suspect the changes in the GitHub file structure haven't been reflected in the "code", but I might be wrong. Pardon my ignorance, as I am just an infra guy.
Thanks,
Shirish
The ECK operator looks healthy but I do not have enough information to debug more. By default, the operator is deployed in the 'elastic-system' namespace and manages Elasticsearch, Kibana and APM server resources in the 'default' namespace.
Can you provide info about the Elasticsearch resource and its associated pods (without filtering with the 'elastic-system' namespace)?
kubectl get elasticsearch
kubectl describe elasticsearch
kubectl get pods
kubectl describe pods
kubectl get events
kubectl describe events
@shirishatideal hmm, I am not sure if this is the issue but here is my guess:
(1) The Elasticsearch CR configures Version: 7.3.0, which differs from the version: 7.2.0 used in the quickstart guide. Can you see if the same issue occurs after changing it to 7.2.0?
(2) The error in the operator log suggests that some validation of the CR is failing (AFAIK), and it could be that the new field spec.nodes.name is missing from the CR. Can you try adding it and see if it helps?
Option (2) may not work, as the operator may be running an older version, or older CRDs may have been submitted in which this field does not exist.
Let me know if either of these options help in troubleshooting your issue.
"Timeout: request did not complete within requested timeout 30s" this seems to be the problem. I think this is an error returned by the apiserver to the operator.
I'm wondering if there might be some kind of firewall/network issue preventing the operator from reaching the apiserver.
Regarding (1), did you delete the previous CR completely and submit a new CR with the 7.2.0 version? Can you try deleting/cleaning everything and retrying, following every instruction exactly as written in the quickstart, to see if that helps?
For (2), you'll need to add the name to spec.nodes. For example:
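A minimal sketch of what that might look like, assuming the quickstart manifest and the v1alpha1 API version. The resource name `quickstart` and the node name `default` are placeholders, and the `name` field will only be accepted if the CRDs installed in your cluster actually define it:

```yaml
# Sketch only: based on the quickstart manifest, with a name added to the node entry.
# apiVersion is an assumption; use whatever version your installed CRDs serve.
apiVersion: elasticsearch.k8s.elastic.co/v1alpha1
kind: Elasticsearch
metadata:
  name: quickstart        # placeholder resource name
spec:
  version: 7.2.0          # matches the quickstart guide, per option (1)
  nodes:
  - name: default         # placeholder value for the spec.nodes.name field
    nodeCount: 1
```

If the apiserver rejects the `name` field as unknown, that would point to older CRDs being installed rather than a problem with the CR itself.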
If neither of these works, then it could be a setup or environment issue. You can try the same steps on minikube to see if that helps, and then debug the non-working setup step by step.