Reconciler error when trying to deploy an elasticsearch

Hello,

I've got an issue with the ECK operator on OpenShift when trying to deploy Elasticsearch. When I create the Elasticsearch object, I get the following errors in the operator logs:

{"level":"info","@timestamp":"2020-02-10T15:23:21.824Z","logger":"elasticsearch-controller","message":"Ending reconciliation run","ver":"1.0.0-6881438d","iteration":16,"namespace":"mosaic-elk","name":"elasticsearch-sample","took":0.843814306}
{"level":"error","@timestamp":"2020-02-10T15:23:21.824Z","logger":"controller-runtime.controller","message":"Reconciler error","ver":"1.0.0-6881438d","controller":"elasticsearch-controller","request":"mosaic-elk/elasticsearch-sample","error":"the server could not find the requested resource (put elasticsearches.elasticsearch.k8s.elastic.co elasticsearch-sample)","errorCauses":[{"error":"the server could not find the requested resource (put elasticsearches.elasticsearch.k8s.elastic.co elasticsearch-sample)"}],"stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/pkg/mod/github.com/go-logr/zapr@v0.1.0/zapr.go:128\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:258\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:232\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:211\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20191028221656-72ed19daf4bb/pkg/util/wait/wait.go:152\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20191028221656-72ed19daf4bb/pkg/util/wait/wait.go:153\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20191028221656-72ed19daf4bb/pkg/util/wait/wait.go:88"}
{"level":"debug","@timestamp":"2020-02-10T15:23:21.824Z","logger":"controller-runtime.manager.events","message":"Warning","ver":"1.0.0-6881438d","object":{"kind":"Elasticsearch","namespace":"mosaic-elk","name":"elasticsearch-sample","uid":"20adeb99-4c19-11ea-a947-fa163e7e2bbc","apiVersion":"elasticsearch.k8s.elastic.co/v1","resourceVersion":"7997994"},"reason":"ReconciliationError","message":"Reconciliation error: the server could not find the requested resource (put elasticsearches.elasticsearch.k8s.elastic.co elasticsearch-sample)"}
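For anyone reading these operator logs: each entry is one JSON object per line, so the failing request can be pulled out with a few lines of Python. This is only a sketch. The `reconciler_errors` helper is a hypothetical name, and the sample text is a shortened copy of the entries above (timestamps, stack trace and `errorCauses` omitted).

```python
import json

# Shortened copies of the operator log entries quoted above
# (timestamps, stack trace and errorCauses dropped for readability).
SAMPLE_LOG = (
    '{"level":"info","logger":"elasticsearch-controller",'
    '"message":"Ending reconciliation run","namespace":"mosaic-elk",'
    '"name":"elasticsearch-sample"}\n'
    '{"level":"error","logger":"controller-runtime.controller",'
    '"message":"Reconciler error","controller":"elasticsearch-controller",'
    '"request":"mosaic-elk/elasticsearch-sample",'
    '"error":"the server could not find the requested resource '
    '(put elasticsearches.elasticsearch.k8s.elastic.co elasticsearch-sample)"}\n'
)


def reconciler_errors(log_text):
    """Return (request, error) pairs for every error-level JSON log line."""
    found = []
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log entry
        if isinstance(entry, dict) and entry.get("level") == "error":
            found.append((entry.get("request"), entry.get("error")))
    return found


if __name__ == "__main__":
    for request, error in reconciler_errors(SAMPLE_LOG):
        print(f"{request}: {error}")
```

Fed the full operator log instead of the sample, the same function would list every request that failed reconciliation, which makes it easier to see whether the error is limited to one resource.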

Despite this error, an Elasticsearch pod starts:

$ oc get pods
NAME                                READY     STATUS    RESTARTS   AGE
elastic-operator-0                  1/1       Running   0          33m
elasticsearch-sample-es-default-0   1/1       Running   0          32m
$ oc get elasticsearch
NAME                   AGE
elasticsearch-sample   32m

I haven't been able to find any issue on GitHub or topic on this forum discussing this error.

Here's the file I'm using for deploying elasticsearch:

apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elasticsearch-sample
spec:
  baseImage: registry/elasticsearch
  tag: v7.5.1
  version: 7.5.1
  nodeSets:
    - name: default
      count: 1
      config:
        node.master: true
        node.data: true
        node.store.allow_mmap: false
      podTemplate:
        metadata:
          labels:
            foo: bar
        spec:
          containers:
            - name: elasticsearch
              image: registry/elasticsearch:v7.5.1
              resources:
                requests:
                  memory: 1Gi
                  cpu: 1
                limits:
                  memory: 2Gi
                  cpu: 1
      volumeClaimTemplates:
      - metadata:
          name: elasticsearch-data
        spec:
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 1Gi
          storageClassName: storage

Has anyone encountered this issue?

Thanks.

Hi Florian,

A few questions regarding your issue:

  • What are the versions of your client and OpenShift cluster? (The output of "oc version" would be nice.)
  • Has any previous version of ECK been installed on your OpenShift cluster?
  • Did you use the "all-in-one" manifest to install ECK?

Note that the image can be customized with the image field:

spec:
  version: 7.5.1
  image: registry/elasticsearch:v7.5.1
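As a small illustration of that change, here is a sketch that folds the `baseImage` and `tag` entries of the posted spec into a single `image` entry. It assumes the intent of those two entries was to select `registry/elasticsearch:v7.5.1` (the same image already named in the podTemplate); `use_image_field` is a hypothetical helper working on the spec as a plain dictionary, not part of ECK.

```python
def use_image_field(spec):
    """Return a copy of spec with baseImage/tag folded into one image entry."""
    base, tag = spec.get("baseImage"), spec.get("tag")
    if not (base and tag):
        return dict(spec)  # nothing to fold, return an unchanged copy
    updated = {k: v for k, v in spec.items() if k not in ("baseImage", "tag")}
    # Keep an existing image entry if one is already set.
    updated.setdefault("image", f"{base}:{tag}")
    return updated


if __name__ == "__main__":
    # Top-level fields from the posted manifest (nodeSets left out).
    posted = {
        "baseImage": "registry/elasticsearch",
        "tag": "v7.5.1",
        "version": "7.5.1",
    }
    print(use_image_field(posted))
```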

Thanks

Hi Michael,

First, here's the information about the oc client/cluster:

$ oc version
oc v3.11.0+0cbc58b
kubernetes v1.11.0+d4cacc0

openshift v3.11.98
kubernetes v1.11.0+d4cacc0

No previous version of ECK has been installed on the cluster; I'm using version 1.0.0.

I did start from the "all-in-one" manifest. I've simplified the CRDs a bit and modified the ClusterRoles to limit the rights of the operator. Here are the template files I've made:

CRDs/Webhook:

apiVersion: v1
kind: Template
metadata:
  name: rbac-elastic-operator
objects:
  - apiVersion: apiextensions.k8s.io/v1beta1
    kind: CustomResourceDefinition
    metadata:
      name: kibanas.kibana.k8s.elastic.co
    spec:
      group: kibana.k8s.elastic.co
      names:
        kind: Kibana
        listKind: KibanaList
        plural: kibanas
        singular: kibana
      subresources:
        status: {}
      version: v1
      versions:
      - name: v1
        served: true
        storage: true
      - name: v1beta1
        served: true
        storage: false
      - name: v1alpha1
        served: false
        storage: false
        
  - apiVersion: apiextensions.k8s.io/v1beta1
    kind: CustomResourceDefinition
    metadata:
      name: elasticsearches.elasticsearch.k8s.elastic.co
    spec:
      group: elasticsearch.k8s.elastic.co
      names:
        categories:
        - elastic
        kind: Elasticsearch
        listKind: ElasticsearchList
        plural: elasticsearches
        singular: elasticsearch
      version: v1
      versions:
      - name: v1
        served: true
        storage: true
      - name: v1beta1
        served: true
        storage: false
      - name: v1alpha1
        served: false
        storage: false

        
  - apiVersion: apiextensions.k8s.io/v1beta1
    kind: CustomResourceDefinition
    metadata:
      name: apmservers.apm.k8s.elastic.co
    spec:
      group: apm.k8s.elastic.co
      names:
        categories:
        - elastic
        kind: ApmServer
        listKind: ApmServerList
        plural: apmservers
        singular: apmserver
        shortNames:
        - apm
      version: v1
      versions:
      - name: v1
        served: true
        storage: true
      - name: v1beta1
        served: true
        storage: false
      - name: v1alpha1
        served: false
        storage: false

  - apiVersion: admissionregistration.k8s.io/v1beta1
    kind: ValidatingWebhookConfiguration
    metadata:
      name: elastic-webhook.k8s.elastic.co
    webhooks:
      - clientConfig:
          caBundle: Cg==
          service:
            name: elastic-webhook-server
            namespace: ns
            path: /validate-elasticsearch-k8s-elastic-co-v1-elasticsearch
        failurePolicy: Ignore
        name: elastic-es-validation-v1.k8s.elastic.co
        rules:
          - apiGroups:
              - elasticsearch.k8s.elastic.co
            apiVersions:
              - v1
            operations:
              - CREATE
              - UPDATE
            resources:
              - elasticsearches
      - clientConfig:
          caBundle: Cg==
          service:
            name: elastic-webhook-server
            namespace: ns
            path: /validate-elasticsearch-k8s-elastic-co-v1beta1-elasticsearch
        failurePolicy: Ignore
        name: elastic-es-validation-v1beta1.k8s.elastic.co
        rules:
          - apiGroups:
              - elasticsearch.k8s.elastic.co
            apiVersions:
              - v1beta1
            operations:
              - CREATE
              - UPDATE
            resources:
              - elasticsearches        
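Comparing the three CRDs in this template, only the Kibana one declares `subresources: status: {}`; the Elasticsearch and ApmServer definitions do not. Whether that difference has anything to do with the reconciler error is an assumption and is not established anywhere in this thread, but it is easy to check mechanically. A minimal sketch, with `missing_status_subresource` as a hypothetical helper and the CRD specs reduced to plain dictionaries:

```python
# Trimmed-down copies of the three CRD specs from the template above;
# only the fields relevant to this check are kept.
CRDS = {
    "kibanas.kibana.k8s.elastic.co": {
        "subresources": {"status": {}},
        "version": "v1",
    },
    "elasticsearches.elasticsearch.k8s.elastic.co": {"version": "v1"},
    "apmservers.apm.k8s.elastic.co": {"version": "v1"},
}


def missing_status_subresource(crds):
    """Return the sorted names of CRDs whose spec has no subresources.status."""
    missing = []
    for name, spec in crds.items():
        subresources = spec.get("subresources") or {}
        if "status" not in subresources:
            missing.append(name)
    return sorted(missing)


if __name__ == "__main__":
    for name in missing_status_subresource(CRDS):
        print(f"no status subresource declared: {name}")
```

Running the same check against the unmodified all-in-one manifest would show whether the simplification of the CRDs dropped anything the operator relies on.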

ClusterRoles:

apiVersion: v1
kind: Template
metadata:
  name: rbac-elastic-operator
objects:
  - apiVersion: authorization.openshift.io/v1
    kind: ClusterRole
    metadata:
      labels:
        role: edit
        customresource: elasticsearch
        api: elasticsearch.k8s.elastic.co
        rbac.authorization.k8s.io/aggregate-to-admin: "true"
        rbac.authorization.k8s.io/aggregate-to-edit: "true"
      name: elasticsearch.elasticsearch.k8s.elastic.co-v1
    rules:
    - apiGroups:
      - elasticsearch.k8s.elastic.co
      attributeRestrictions: null
      resources:
      - '*'
      verbs:
      - '*'
    - apiGroups:
      - admissionregistration.k8s.io
      resources:
      - '*'
      verbs:
      - '*'
    - apiGroups:
      - ""
      resources:
      - events
      verbs:
      - patch
      - create
      
  - apiVersion: authorization.openshift.io/v1
    kind: ClusterRole
    metadata:
      labels:
        role: edit
        customresource: kibana
        api: kibana.k8s.elastic.co
        rbac.authorization.k8s.io/aggregate-to-admin: "true"
        rbac.authorization.k8s.io/aggregate-to-edit: "true"
      name: kibana.kibana.k8s.elastic.co-v1
    rules:
    - apiGroups:
      - kibana.k8s.elastic.co
      attributeRestrictions: null
      resources:
      - '*'
      verbs:
      - '*'
    - apiGroups:
      - admissionregistration.k8s.io
      resources:
      - '*'
      verbs:
      - '*'
      
  - apiVersion: authorization.openshift.io/v1
    kind: ClusterRole
    metadata:
      labels:
        role: edit
        customresource: apmserver
        api: apm.k8s.elastic.co
        rbac.authorization.k8s.io/aggregate-to-admin: "true"
        rbac.authorization.k8s.io/aggregate-to-edit: "true"
      name: apm.apm.k8s.elastic.co-v1
    rules:
    - apiGroups:
      - apm.k8s.elastic.co
      attributeRestrictions: null
      resources:
      - '*'
      verbs:
      - '*'
    - apiGroups:
      - admissionregistration.k8s.io
      resources:
      - '*'
      verbs:
      - '*'

ServiceAccount/ValidatingWebhookConfiguration/Operator/Role/Binding:

apiVersion: v1
kind: Template
metadata:
  name: elastic-operator
objects:
        
  - apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: elastic-operator

  - kind: RoleBinding
    apiVersion: rbac.authorization.k8s.io/v1
    metadata:
      name: elastic-operator
    subjects:
    - kind: ServiceAccount
      name: elastic-operator
    roleRef:
      kind: ClusterRole
      name: admin
      apiGroup: rbac.authorization.k8s.io
 
  - apiVersion: v1
    kind: Secret
    metadata:
      name: elastic-webhook-server-cert
      
  - apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: elastic-operator
      namespace: ns
      labels:
        control-plane: elastic-operator
    spec:
      selector:
        matchLabels:
          control-plane: elastic-operator
      serviceName: elastic-operator
      template:
        metadata:
          labels:
            control-plane: elastic-operator
        spec:
          serviceAccountName: elastic-operator
          containers:
          - name: eck-operator
            image:  registry/eck-operator:v1.0.0
            imagePullPolicy: IfNotPresent
            args: 
              - 'manager'
              - '--operator-roles=namespace'
              - '--operator-namespace=$(WATCH_NAMESPACE)'
              - '--namespaces=$(WATCH_NAMESPACE)'
              - '--log-verbosity=1'
            env:
              - name: WATCH_NAMESPACE
                valueFrom:
                  fieldRef:
                    fieldPath: metadata.namespace
              - name: WEBHOOK_SECRET
                value: elastic-webhook-server-cert
              - name: WEBHOOK_PODS_LABEL
                value: elastic-operator
              - name: OPERATOR_IMAGE
                value: registry/eck-operator:v1.0.0
            ports:
              - containerPort: 9443
                name: webhook-server
                protocol: TCP            
            volumeMounts:
              - mountPath: /tmp/k8s-webhook-server/serving-certs
                name: cert
                readOnly: true
            resources:
              limits:
                cpu: 1
                memory: 150Mi
              requests:
                cpu: 100m
                memory: 50Mi
          terminationGracePeriodSeconds: 10
          volumes:
            - name: cert
              secret:
                defaultMode: 420
                secretName: elastic-webhook-server-cert
                
  - apiVersion: admissionregistration.k8s.io/v1beta1
    kind: ValidatingWebhookConfiguration
    metadata:
      name: elastic-webhook.k8s.elastic.co
    webhooks:
      - clientConfig:
          caBundle: Cg==
          service:
            name: elastic-webhook-server
            namespace: ns
            path: /validate-elasticsearch-k8s-elastic-co-v1-elasticsearch
        failurePolicy: Ignore
        name: elastic-es-validation-v1.k8s.elastic.co
        rules:
          - apiGroups:
              - elasticsearch.k8s.elastic.co
            apiVersions:
              - v1
            operations:
              - CREATE
              - UPDATE
            resources:
              - elasticsearches
      - clientConfig:
          caBundle: Cg==
          service:
            name: elastic-webhook-server
            namespace: ns
            path: /validate-elasticsearch-k8s-elastic-co-v1beta1-elasticsearch
        failurePolicy: Ignore
        name: elastic-es-validation-v1beta1.k8s.elastic.co
        rules:
          - apiGroups:
              - elasticsearch.k8s.elastic.co
            apiVersions:
              - v1beta1
            operations:
              - CREATE
              - UPDATE
            resources:
              - elasticsearches

  - apiVersion: v1
    kind: Service
    metadata:
      name: elastic-webhook-server
    spec:
      ports:
        - port: 443
          targetPort: 9443
      selector:
        control-plane: elastic-operator

Thanks for the tips regarding the image field.

I can't reproduce your issue, even though I'm using the same (server) version:

Client Version: openshift-clients-4.2.2-201910250432
Kubernetes Version: v1.11.0+d4cacc0

Maybe I'm missing something but it seems that you are binding the ServiceAccount elastic-operator to the admin ClusterRole:

kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: elastic-operator
subjects:
- kind: ServiceAccount
  name: elastic-operator
roleRef:
  kind: ClusterRole
  name: admin
  apiGroup: rbac.authorization.k8s.io

Is it intended?

I would first advise using the provided role, to rule this binding out as the root of the issue.

Hello Michael,

Yes, it is.

I've also tried to deploy the default all-in-one template; the same errors occur.

I've also reproduced the issue on Minishift (version 3.11).

I've also tried to deploy the template on a k8s cluster (1.15), and I did not get any errors.

Could these errors be the result of the cluster's Kubernetes version (1.11)?

As I said, I can't reproduce your error using the all-in-one manifest on a brand new OCP 3.11 cluster.

I have applied your cluster roles and bound the elastic-operator ServiceAccount to the admin cluster role, and it is still working as expected.

Something odd in your template is this hardcoded namespace in the webhook client config (which is not a namespaced resource):

- apiVersion: admissionregistration.k8s.io/v1beta1
  kind: ValidatingWebhookConfiguration
  metadata:
    name: elastic-webhook.k8s.elastic.co
  webhooks:
    - clientConfig:
        caBundle: Cg==
        service:
          name: elastic-webhook-server
          namespace: ns

Does it exist? Also note that if ECK is started with only the namespace role, it will not start the webhook server.
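To make the namespace question concrete, here is a rough sketch that lists the webhooks whose service namespace differs from the namespace the operator runs in. It assumes the operator lives in `mosaic-elk`, as the logs and `oc get pods` output earlier in the thread suggest, and `ns` may of course just be a placeholder in the posted template. `mismatched_webhooks` is a hypothetical helper working on plain dictionaries.

```python
# Webhook entries reduced to the fields discussed above; "ns" is the
# hardcoded namespace from the posted template.
WEBHOOK_CONFIG = {
    "webhooks": [
        {
            "name": "elastic-es-validation-v1.k8s.elastic.co",
            "clientConfig": {
                "service": {"name": "elastic-webhook-server", "namespace": "ns"}
            },
        },
        {
            "name": "elastic-es-validation-v1beta1.k8s.elastic.co",
            "clientConfig": {
                "service": {"name": "elastic-webhook-server", "namespace": "ns"}
            },
        },
    ]
}


def mismatched_webhooks(config, operator_namespace):
    """Return names of webhooks whose service namespace is not the operator's."""
    mismatched = []
    for hook in config.get("webhooks", []):
        service = hook.get("clientConfig", {}).get("service", {})
        if service.get("namespace") != operator_namespace:
            mismatched.append(hook["name"])
    return mismatched


if __name__ == "__main__":
    # "mosaic-elk" is assumed to be the operator namespace (see lead-in).
    for name in mismatched_webhooks(WEBHOOK_CONFIG, "mosaic-elk"):
        print(f"webhook points at another namespace: {name}")
```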

Could you remove the validation webhook to check whether it is the root cause of the issue?

I've already removed the validation webhook and the Service elastic-webhook-server. The result is the same.