We are setting up an Elasticsearch cluster on GKE with the following layout:
Master nodes as Kubernetes Deployments
Client nodes as Kubernetes Deployments with an HPA
Data nodes as StatefulSets with PersistentVolumes
We were able to set up the cluster itself without trouble, but we are struggling to configure the snapshot backup mechanism. Essentially, we are following this guide. We can follow it up to the step of obtaining the secret JSON key, but after that we are not sure how to add the key to the Elasticsearch keystore and proceed. We have been stuck on this for quite some time and the documentation has not been great: every doc says to add the JSON key to the elasticsearch keystore, but none explains how to do that when the JSON file is on our local shell while the keystore lives inside the ES pods. We have also created a custom Dockerfile to install the GCS plugin. Really looking for some help here.
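For reference, the commands we believe we need to run against each pod look roughly like this (this assumes the repository-gcs plugin; the pod name, namespace, and file paths are placeholders):

# Copy the service-account key from the local shell into the pod
kubectl cp gcs-key.json default/es-data-0:/tmp/gcs-key.json

# Add it to the keystore under the setting the GCS plugin reads
kubectl exec es-data-0 -- bin/elasticsearch-keystore add-file gcs.client.default.credentials_file /tmp/gcs-key.json

# Ask the running nodes to pick up the new secure setting
curl -X POST "localhost:9200/_nodes/reload_secure_settings"

Is this roughly right, and is there a way to avoid repeating it by hand every time a pod is rescheduled?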
The orchestration of an Elasticsearch cluster is not simple, and it's easy to get it wrong in ways that occasionally lose data. I recommend using the official operator rather than trying to develop your own orchestration.
One of the core reasons we did not go ahead with the official operator is that we were not sure whether we could configure it to operate at loads of around 100k RPS for both reads and writes.
It seems that we should really try this out. I would love to hear your suggestions about what sort of configuration we should use. Our existing ES cluster holds about 500 GB of data and serves about 100k RPS. We are planning to use 8 vCPU / 32 GB machines for our Kubernetes ES cluster so that we can have a heap size of about 14-15 GB. Can you suggest some configuration tips based on your experience with this operator? Also, how does the operator take care of autoscaling, especially of the data and client nodes?
I do not think the orchestration mechanism should have any impact on performance. You should get the same cluster however it's orchestrated.
Benchmark your setup with a realistic workload. That's the only way you can truly validate its performance characteristics. Our public benchmarks show performance on some workloads in excess of 100k per second on a three-node benchmarking cluster, but performance is very dependent on your workload and hardware so you must perform your own experiments.
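If you want a starting point, Rally is the tool we use for those benchmarks. Something along these lines runs a load test against an existing cluster (the track name and target host are placeholders, and the exact command syntax varies between Rally versions):

esrally race --track=http_logs --target-hosts=es-cluster-es-http:9200 --pipeline=benchmark-only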
I don't think there's any auto-scaling yet. It doesn't seem necessary in such a small cluster.
I tried creating a cluster, but some of my pods are being OOM-killed, which is strange given that I am using 8 vCPU / 30 GB machines. I would have expected this to work well with the resources I am using.
Below is my elasticsearch.yaml:
Please use the </> button to format any YAML you are sharing properly. YAML is whitespace-sensitive and if you don't format it properly then it's quite meaningless.
Set Xmx and Xms to no more than 50% of your physical RAM. Elasticsearch requires memory for purposes other than the JVM heap and it is important to leave space for this...
Here "physical RAM" means "RAM allocated to the container". Your coordinating nodes have a 7GB heap on a 6GB container which is completely hopeless, and the other containers have heap size equal to container RAM which is still off by a factor of 2.
I have updated it and the cluster is running fine. I have created a ClusterIP service:
NAME                              TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)    AGE
service/elastic-webhook-service   ClusterIP   10.64.4.214   <none>        443/TCP    4h10m
service/es-cluster-es-http        ClusterIP   10.64.7.198   <none>        9200/TCP   163m
But when I SSH into a node of the Kubernetes cluster (in another node pool, one that is not running the ES pods) and try curl commands, I get no reply from the server:

curl -X GET "10.64.7.198:9200/_cluster/health?pretty"
curl: (52) Empty reply from server

Any idea what is happening? I am not sure whether my cluster is actually running. How do I create indexes and insert data?
kubectl -n elastic-system get elasticsearch:
NAME         HEALTH   NODES   VERSION   PHASE         AGE
es-cluster   green    10      7.2.0     Operational   2h
That sounds like possibly a network config issue, but I'm not the best person to ask about this. I've moved this post over to the ECK forum and hopefully someone else can help with the details here.
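One thing worth checking, though: ECK enables TLS and basic authentication by default, so a plain-HTTP, unauthenticated curl gets exactly that kind of empty reply. A quick test would look something like this (the secret name follows ECK's <cluster-name>-es-elastic-user convention; -k skips certificate verification and is for testing only):

# Fetch the generated password for the built-in elastic user
PASSWORD=$(kubectl -n elastic-system get secret es-cluster-es-elastic-user -o go-template='{{.data.elastic | base64decode}}')

# Query the cluster over HTTPS with basic auth
curl -u "elastic:$PASSWORD" -k "https://10.64.7.198:9200/_cluster/health?pretty"

# Creating an index and inserting a document work the same way
curl -u "elastic:$PASSWORD" -k -X PUT "https://10.64.7.198:9200/my-index"
curl -u "elastic:$PASSWORD" -k -X POST "https://10.64.7.198:9200/my-index/_doc" -H 'Content-Type: application/json' -d '{"field":"value"}'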
Yeah, got it working. Thanks a lot; this entire thread has been super useful, and my cluster is up and running. My one remaining concern is that there is no autoscaling: if my pods approach their CPU limits, there is nothing like a Kubernetes Horizontal Pod Autoscaler to automatically schedule more pods, so I will have to monitor the cluster myself for each possible bottleneck and scale it manually.