Using GKE Workload Identity to snapshot and restore Elasticsearch data

We currently have a GKE cluster with the Elastic Cloud on Kubernetes (ECK) operator deployed and running. Our Elasticsearch resource looks as follows:

apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: map
  namespace: elastic
spec:
  http:
    service:
      spec: {}
    tls:
      certificate: {}
      selfSignedCertificate:
        disabled: true
  nodeSets:
  - config:
      node.data: false
      node.ingest: false
      node.master: true
      node.store.allow_mmap: false
    count: 3
    name: master-nodes
    podTemplate:
      spec:
        initContainers:
        - command:
          - sh
          - -c
          - |
            bin/elasticsearch-plugin install --batch repository-gcs
          name: install-plugins
        securityContext:
          fsGroup: 1000
          runAsGroup: 1000
          runAsUser: 1000
        serviceAccount: elasticsearch
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 5Gi
        storageClassName: standard
  - config:
      node.data: true
      node.ingest: true
      node.master: false
      node.store.allow_mmap: false
    count: 3
    name: data-nodes
    podTemplate:
      spec:
        initContainers:
        - command:
          - sh
          - -c
          - |
            bin/elasticsearch-plugin install --batch repository-gcs
          name: install-plugins
        securityContext:
          fsGroup: 1000
          runAsGroup: 1000
          runAsUser: 1000
        serviceAccount: elasticsearch
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 5Gi
        storageClassName: standard
  updateStrategy:
    changeBudget: {}
  version: 7.6.1

and the Kibana resource looks like:

apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
  name: map
  namespace: elastic
spec:
  count: 1
  elasticsearchRef:
    name: map
  http:
    service:
      spec: {}
    tls:
      certificate: {}
      selfSignedCertificate:
        disabled: true
  podTemplate:
    metadata:
      creationTimestamp: {}
    spec:
      containers: {}
      serviceAccount: elasticsearch
  version: 7.6.1

We also have a Kubernetes ServiceAccount named elasticsearch, with a GCP service account attached to it via the iam.gke.io/gcp-service-account annotation. That identity should therefore be available to the Elasticsearch and Kibana pods that run under this service account. However, when we try to set up snapshot and restore in Kibana, verifying the repository fails with the following error:
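
For reference, the ServiceAccount is roughly the following (the GCP service account e-mail shown is a placeholder for our real one):

apiVersion: v1
kind: ServiceAccount
metadata:
  name: elasticsearch
  namespace: elastic
  annotations:
    # placeholder e-mail for the GCP service account that has access to the bucket
    iam.gke.io/gcp-service-account: elastic-backup@my-project.iam.gserviceaccount.com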

"java.lang.RuntimePermission" "accessDeclaredMembers"

In addition, the Elasticsearch pods give us the following output:

{"type": "server", "timestamp": "2020-03-11T09:17:52,209Z", "level": "WARN", "component": "r.suppressed", "cluster.name": "map", "node.name": "map-es-master-nodes-0", "message": "path: /_snapshot/elastic-backup/_verify, params: {repository=elastic-backup}", "cluster.uuid": "myZqUgmpRKam86sYzrpMUQ", "node.id": "am9bO1nFQ76nCvEGGfgvtA" ,
"stacktrace": ["org.elasticsearch.transport.RemoteTransportException: [map-es-master-nodes-2][172.16.1.233:9300][cluster:admin/repository/verify]",
"Caused by: org.elasticsearch.repositories.RepositoryException: [elastic-backup] cannot create blob store",
"at org.elasticsearch.repositories.blobstore.BlobStoreRepository.blobStore(BlobStoreRepository.java:424) ~[elasticsearch-7.6.1.jar:7.6.1]",
"at org.elasticsearch.repositories.blobstore.BlobStoreRepository.startVerification(BlobStoreRepository.java:1033) ~[elasticsearch-7.6.1.jar:7.6.1]",
"at org.elasticsearch.repositories.RepositoriesService$3.doRun(RepositoriesService.java:246) ~[elasticsearch-7.6.1.jar:7.6.1]",
"at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:692) ~[elasticsearch-7.6.1.jar:7.6.1]",
"at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-7.6.1.jar:7.6.1]",
"at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]",
"at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]",
"at java.lang.Thread.run(Thread.java:830) [?:?]",
"Caused by: org.elasticsearch.common.blobstore.BlobStoreException: Unable to check if bucket [gs://disco-pre-dev-100-elastic] exists",
"at org.elasticsearch.repositories.gcs.GoogleCloudStorageBlobStore.doesBucketExist(GoogleCloudStorageBlobStore.java:118) ~[?:?]",
"at org.elasticsearch.repositories.gcs.GoogleCloudStorageBlobStore.<init>(GoogleCloudStorageBlobStore.java:89) ~[?:?]",
"at org.elasticsearch.repositories.gcs.GoogleCloudStorageRepository.createBlobStore(GoogleCloudStorageRepository.java:94) ~[?:?]",
"at org.elasticsearch.repositories.gcs.GoogleCloudStorageRepository.createBlobStore(GoogleCloudStorageRepository.java:42) ~[?:?]",
"at org.elasticsearch.repositories.blobstore.BlobStoreRepository.blobStore(BlobStoreRepository.java:420) ~[elasticsearch-7.6.1.jar:7.6.1]",
"at org.elasticsearch.repositories.blobstore.BlobStoreRepository.startVerification(BlobStoreRepository.java:1033) ~[elasticsearch-7.6.1.jar:7.6.1]",
"at org.elasticsearch.repositories.RepositoriesService$3.doRun(RepositoriesService.java:246) ~[elasticsearch-7.6.1.jar:7.6.1]",
"at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:692) ~[elasticsearch-7.6.1.jar:7.6.1]",
"at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-7.6.1.jar:7.6.1]",
"at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]",
"at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]",
"at java.lang.Thread.run(Thread.java:830) ~[?:?]",
"Caused by: java.lang.SecurityException: access denied (\"java.lang.RuntimePermission\" \"accessDeclaredMembers\")",
"at java.security.AccessControlContext.checkPermission(AccessControlContext.java:472) ~[?:?]",
"at java.security.AccessController.checkPermission(AccessController.java:1036) ~[?:?]",
"at java.lang.SecurityManager.checkPermission(SecurityManager.java:408) ~[?:?]",
"at java.lang.Class.checkMemberAccess(Class.java:2848) ~[?:?]",
"at java.lang.Class.getDeclaredFields(Class.java:2247) ~[?:?]",
"at com.google.api.client.util.ClassInfo.<init>(ClassInfo.java:175) ~[?:?]",
"at com.google.api.client.util.ClassInfo.of(ClassInfo.java:90) ~[?:?]",
"at com.google.api.client.util.ClassInfo.<init>(ClassInfo.java:198) ~[?:?]",
"at com.google.api.client.util.ClassInfo.of(ClassInfo.java:90) ~[?:?]",
"at com.google.api.client.util.ClassInfo.<init>(ClassInfo.java:198) ~[?:?]",
"at com.google.api.client.util.ClassInfo.of(ClassInfo.java:90) ~[?:?]",
"at com.google.api.client.util.GenericData.<init>(GenericData.java:74) ~[?:?]",
"at com.google.api.client.util.GenericData.<init>(GenericData.java:55) ~[?:?]",
"at com.google.api.client.http.GenericUrl.<init>(GenericUrl.java:146) ~[?:?]",
"at com.google.api.client.http.GenericUrl.<init>(GenericUrl.java:129) ~[?:?]",
"at com.google.api.client.http.GenericUrl.<init>(GenericUrl.java:102) ~[?:?]",
"at com.google.cloud.ServiceOptions.getAppEngineProjectIdFromMetadataServer(ServiceOptions.java:450) ~[?:?]",
"at com.google.cloud.ServiceOptions.getAppEngineProjectId(ServiceOptions.java:429) ~[?:?]",
"at com.google.cloud.ServiceOptions.getDefaultProjectId(ServiceOptions.java:336) ~[?:?]",
"at com.google.cloud.ServiceOptions.getDefaultProject(ServiceOptions.java:313) ~[?:?]",
"at com.google.cloud.ServiceOptions.<init>(ServiceOptions.java:264) ~[?:?]",
"at com.google.cloud.storage.StorageOptions.<init>(StorageOptions.java:82) ~[?:?]",
"at com.google.cloud.storage.StorageOptions.<init>(StorageOptions.java:31) ~[?:?]",
"at com.google.cloud.storage.StorageOptions$Builder.build(StorageOptions.java:77) ~[?:?]",
"at org.elasticsearch.repositories.gcs.GoogleCloudStorageService.createStorageOptions(GoogleCloudStorageService.java:153) ~[?:?]",
"at org.elasticsearch.repositories.gcs.GoogleCloudStorageService.createClient(GoogleCloudStorageService.java:118) ~[?:?]",
"at org.elasticsearch.repositories.gcs.GoogleCloudStorageService.lambda$refreshAndClearCache$0(GoogleCloudStorageService.java:67) ~[?:?]",
"at org.elasticsearch.common.util.LazyInitializable.maybeCompute(LazyInitializable.java:103) ~[elasticsearch-7.6.1.jar:7.6.1]",
"at org.elasticsearch.common.util.LazyInitializable.getOrCompute(LazyInitializable.java:81) ~[elasticsearch-7.6.1.jar:7.6.1]",
"at org.elasticsearch.repositories.gcs.GoogleCloudStorageService.client(GoogleCloudStorageService.java:92) ~[?:?]",
"at org.elasticsearch.repositories.gcs.GoogleCloudStorageBlobStore.client(GoogleCloudStorageBlobStore.java:95) ~[?:?]",
"at org.elasticsearch.repositories.gcs.GoogleCloudStorageBlobStore.lambda$doesBucketExist$0(GoogleCloudStorageBlobStore.java:115) ~[?:?]",
"at java.security.AccessController.doPrivileged(AccessController.java:554) ~[?:?]",
"at org.elasticsearch.repositories.gcs.SocketAccess.doPrivilegedIOException(SocketAccess.java:44) ~[?:?]",
"at org.elasticsearch.repositories.gcs.GoogleCloudStorageBlobStore.doesBucketExist(GoogleCloudStorageBlobStore.java:115) ~[?:?]",
"at org.elasticsearch.repositories.gcs.GoogleCloudStorageBlobStore.<init>(GoogleCloudStorageBlobStore.java:89) ~[?:?]",
"at org.elasticsearch.repositories.gcs.GoogleCloudStorageRepository.createBlobStore(GoogleCloudStorageRepository.java:94) ~[?:?]",
"at org.elasticsearch.repositories.gcs.GoogleCloudStorageRepository.createBlobStore(GoogleCloudStorageRepository.java:42) ~[?:?]",
"at org.elasticsearch.repositories.blobstore.BlobStoreRepository.blobStore(BlobStoreRepository.java:420) ~[elasticsearch-7.6.1.jar:7.6.1]",
"at org.elasticsearch.repositories.blobstore.BlobStoreRepository.startVerification(BlobStoreRepository.java:1033) ~[elasticsearch-7.6.1.jar:7.6.1]",
"at org.elasticsearch.repositories.RepositoriesService$3.doRun(RepositoriesService.java:246) ~[elasticsearch-7.6.1.jar:7.6.1]",
"at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:692) ~[elasticsearch-7.6.1.jar:7.6.1]",
"at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-7.6.1.jar:7.6.1]",
"at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]",
"at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]",
"at java.lang.Thread.run(Thread.java:830) ~[?:?]"] }

Is this a supported feature within the operator? When we revert to credential files it works as the guide suggests, but our preference is Workload Identity because of all its benefits.
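
For context, the credentials-file fallback that does work for us is roughly the following, per the ECK snapshots guide (the secret name and key-file name here are placeholders):

kubectl create secret generic gcs-credentials \
  --from-file=gcs.client.default.credentials_file=service-account-key.json

and then in the Elasticsearch spec:

spec:
  secureSettings:
  - secretName: gcs-credentials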

If I am reading the plugin docs correctly, this is expected:

The plugin must authenticate the requests it makes to the Google Cloud Storage service. It is common for Google client libraries to employ a strategy named application default credentials. However, that strategy is not supported for use with Elasticsearch. The plugin operates under the Elasticsearch process, which runs with the security manager enabled. The security manager obstructs the "automatic" credential discovery. Therefore, you must configure service account credentials even if you are using an environment that does not normally require this configuration (such as Compute Engine, Kubernetes Engine or App Engine).

Thanks for the response! Yes, I believe what I'm looking for might not currently be supported in the plugin, likely as a consequence of how the Google client library works. That would be a shame, because using Workload Identity to authenticate and push snapshots to GCS would be a neat solution.

https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity
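
For anyone following along, the GCP-side half of the binding is the standard one from that page (the project and service account names below are placeholders):

gcloud iam service-accounts add-iam-policy-binding \
  elastic-backup@my-project.iam.gserviceaccount.com \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:my-project.svc.id.goog[elastic/elasticsearch]"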

Not sure if this is even on the cards as something repository-gcs would like to support at some point. It might be worth filing a feature request if at all possible.

Hi @Anya_Sabo, thanks for your update here. I'm having a similar issue getting the snapshot repository to authenticate to the GCS bucket using a service account as you mentioned (following this setup: https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-snapshots.html), but it's not working and Kibana says "Not Connected", with the error below:
{
  "error": {
    "root_cause": [
      {
        "type": "blob_store_exception",
        "reason": "Unable to check if bucket [demo-elk-snapshot-repo] exists"
      }
    ],
    "type": "repository_exception",
    "reason": "[demo-gcs-snapshot-repo] cannot create blob store",
    "caused_by": {
      "type": "blob_store_exception",
      "reason": "Unable to check if bucket [demo-elk-snapshot-repo] exists",
      "caused_by": {
        "type": "security_exception",
        "reason": "access denied (\"java.lang.RuntimePermission\" \"accessDeclaredMembers\")"
      }
    }
  },
  "status": 500
}
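
For reference, the repository was registered roughly like this (repository and bucket names are taken from the error above, using the default client):

PUT _snapshot/demo-gcs-snapshot-repo
{
  "type": "gcs",
  "settings": {
    "bucket": "demo-elk-snapshot-repo"
  }
}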

Do you know what could be causing this error? Thanks