Deploying E+K onto K8s cluster, multiple subnets and nodegroups

This is a second chance to get some assistance re: Deploying E+K onto AWS EKS cluster, multiple subnets and nodegroups across 2 az's - #2 by George_Leonard

That thread asks for assistance in deploying the Elastic + Kibana stack onto a K8s cluster hosted at AWS as an EKS cluster. I'm still finding my feet with AWS and EKS, and I'm pretty noob'ish on Elastic as well, so there's a lot of learning waiting here; part of the deployment is actually to learn and figure out how things fit together.

Or maybe find a good E+K-on-K8s example and then figure out what I need to change to make it AWS EKS specific.

G

I'm going to preface this response by saying I've never done what you're asking (regarding subnet separation at the K8s level). What follows is how I would go about solving it, more as a high-level concept, since I believe much of the question really relates to Kubernetes architecture rather than ECK.

  1. (Preface: I've never used AWS EKS, so the details here might vary.) I don't think you need to worry about subnets at the Kubernetes layer. Generally the Kubernetes cluster will have some sort of cluster network via a network plugin, and it is on this network that the actual pods (in this case Elasticsearch and Kibana) are brought up.
  2. With the above in mind, the next thing to cover is how to expose Kibana to your management network and Elasticsearch to your app network. In my experience, the way to do this is with a Kubernetes Ingress to get the traffic inside your cluster, and a Service to route the traffic to the correct location (pods). There is a rough sketch after this list.
    • You would expose one ingress on your app network side, which routes traffic to your Elasticsearch instances.
    • You would expose another ingress on your management network side, which routes traffic to your Kibana instances.
    • See these docs to get an idea of how to create the cluster Service that handles the actual routing of traffic to the intended nodes.
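As a rough sketch of what I mean (this assumes an ingress controller is already installed, and uses the quickstart-es-http / quickstart-kb-http Service names that ECK creates for the quickstart example; the hostnames are placeholders):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: elasticsearch-app            # exposed on the app network side
spec:
  rules:
  - host: es.app.example.com         # placeholder hostname
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: quickstart-es-http # Service ECK creates for Elasticsearch
            port:
              number: 9200
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: kibana-mgmt                  # exposed on the management network side
spec:
  rules:
  - host: kibana.mgmt.example.com    # placeholder hostname
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: quickstart-kb-http # Service ECK creates for Kibana
            port:
              number: 5601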

P.S. You should probably close the conversation you linked to and update this post with the linked post's content. The ECK section is probably the correct section, and having two open posts about the same thing can split the conversation and make it harder for users to help.


OK, some background. With AWS, as you're aware, you have the ability to create multiple subnets, and you also have 1-3 AZs (availability zones, aka different physical locations).

With EKS you create a cluster (which is managed) and then add worker nodes. During the creation of the worker nodes you specify the subnet, so with 3 subnets per AZ and 2 AZs you can basically create 6 worker nodes, one on each of the 6 subnets. You then add labels to the workers;
when you now deploy your pod you can use/specify a nodeSelector, which pushes the pod onto selected nodes, or AZs, or both.
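For example, something like this (the node name and label here are made up for illustration):

# Label a worker node (or set labels on the EKS nodegroup at creation time)
kubectl label node ip-10-0-1-23.ec2.internal nodegroup=app-az1

# ...and then in the pod spec, pin the pod to those nodes and/or an AZ:
spec:
  nodeSelector:
    nodegroup: app-az1                        # the custom label above
    topology.kubernetes.io/zone: eu-west-1a   # standard AZ label EKS puts on nodes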

For this exercise I will create any number of nodes to match the desired / best design / layout.

I think the best plan might be to start there: the cluster design. We then match the required YAML to deploy the Elastic stack onto it, and then decide/define how and where we want Kibana deployed. For us everything needs to be HA, implying Kibana also, so I'd have a question when we get there: how to deploy multiple Kibanas and have them share their configuration/dashboard layout.

Happy to be the guinea pig here to get this designed/laid out for others to use/abuse.

G
PS: I'll go and close the other thread.

To run multiple Kibana instances, you can just change the spec.count value in the Kibana CRD from 1 (the default) to the desired number of Kibana instances. Kibana doesn't actually store any data; it is all stored in Elasticsearch, so you don't have to worry about sharing configuration between Kibana instances.
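A minimal example (the name and version here are just the quickstart placeholders):

apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
  name: quickstart
spec:
  version: 7.16.1
  count: 3                  # three Kibana instances instead of the default 1
  elasticsearchRef:
    name: quickstart        # the Elasticsearch cluster this Kibana connects to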

Notes:

  1. Technically, while Kibana can run multiple instances for "High Availability", during a version upgrade of Kibana (i.e. 7.15.0 -> 7.15.1) the ECK operator will take down all Kibana instances at the same time and bring the upgraded versions back up one at a time. This is a limitation of Elasticsearch/Kibana, in which only one version of Kibana can be active at a time. Therefore, there will be some downtime for Kibana during a version upgrade.
  2. I'm not sure I fully understand your desired architecture for this. If possible, I'd recommend you provide an architecture diagram of how you think you want things, or provide more detail on the intended setup. (I've never used AWS EKS, so I'm not familiar with all the AWS-specific features they offer and how they might change/affect your desired setup.)
  3. If you haven't already, I'd definitely recommend just trying to get a simple cluster working first in AWS EKS, so that you can get an understanding of how to build working configs for all the Elastic Stack components you want to use. This way, as you get into the more complex details of your desired setup, you know that you at least have a mostly working configuration that should only require minor changes.

P.S.: If you plan on using 2 AZ's, you technically don't have an HA setup. This is because Elasticsearch requires a minimum of 3 master/controller nodes for HA, as at least 2 nodes need to be available for quorum/voting to be handled. If you want to go with 2 AZ's for data nodes, that is fine, but it is recommended to have a 3rd AZ for at least the 3rd master/controller node, so that in the event that one of your AZ's goes down, you have at least 2 master/controller nodes available in your cluster at all times.
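To make that concrete, one way I'd picture pinning one master per AZ with ECK nodeSets (a sketch only; this is just the nodeSets fragment of the Elasticsearch spec, and the zone names are examples):

  nodeSets:
  - name: master-a
    count: 1
    config:
      node.roles: ["master"]
    podTemplate:
      spec:
        nodeSelector:
          topology.kubernetes.io/zone: eu-west-1a
  - name: master-b
    count: 1
    config:
      node.roles: ["master"]
    podTemplate:
      spec:
        nodeSelector:
          topology.kubernetes.io/zone: eu-west-1b
  - name: master-c
    count: 1
    config:
      node.roles: ["master"]
    podTemplate:
      spec:
        nodeSelector:
          topology.kubernetes.io/zone: eu-west-1c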

... so Kibana stores its own configuration and the definitions of dashboards etc. all inside Elasticsearch?

OK, agreed on the 3 AZ requirement. True, I should have thought about that.

Let's just work on what a good, right configuration for Elasticsearch is... I can then see how I map that to EKS.

Can you point me to a simple cluster deployment/YAML?
Maybe also point me to a more complicated deployment, so I can baby-step from there to the big-brother setup.

G

Yes, everything that you see inside Kibana (dashboards, visualizations, etc...) is stored in Elasticsearch.

For a simple cluster deployment, I'd recommend you look at the quickstart guide, which goes step-by-step through deploying a simple cluster on Kubernetes.
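For reference, the core of the quickstart manifest is roughly the following (double-check the guide for the current version number):

cat <<EOF | kubectl apply -f -
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: quickstart
spec:
  version: 7.16.1
  nodeSets:
  - name: default
    count: 1
    config:
      node.store.allow_mmap: false
EOF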

For the more advanced stuff, I'd recommend reading over the rest of the ECK docs, especially under Orchestrating Elastic Stack applications and Advanced topics.


Curious: why not make the code available via a Git repo... just so much easier.

It would be great if there were more examples.

G

Where in this command would a nodeSelector clause go, for deployment onto AWS EKS with multiple node groups configured with different node sizes and potentially different storage capabilities?

I see below a data node type. Are there any examples of how a data node can be defined for hot data and for cold/old/infrequent-access data? (I assume the PV the node gets assigned to is the easy part; it's giving the node the associated personality/assignment.)

# Deploy cluster - a bit more advanced
cat <<EOF | kubectl apply -f -
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: quickstart
spec:
  version: 7.16.1   # spec.version is required by the Elasticsearch CRD
  nodeSets:
  - name: masters
    count: 3
    config:
      # On Elasticsearch versions before 7.9.0, replace the node.roles configuration with the following:
      # node.master: true
      node.roles: ["master"]
      xpack.ml.enabled: true
      node.remote_cluster_client: false
  - name: data
    count: 10
    config:
      # On Elasticsearch versions before 7.9.0, replace the node.roles configuration with the following:
      # node.master: false
      # node.data: true
      # node.ingest: true
      # node.ml: true
      # node.transform: true
      node.roles: ["data", "ingest", "ml", "transform"]
      node.remote_cluster_client: false
EOF

Which of master, data, ingest, ml and transform requires PVCs?

Found nodeSelector... :slight_smile:
G

Found the storage selector... basically another label selector...
It would be interesting to hear how the Elastic engine is told about the different storage tiers.
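From what I've been reading, I'm guessing the tiers get assigned via node.roles plus a per-nodeSet nodeSelector, something like this (untested, and the nodegroup labels are made up):

  nodeSets:
  - name: data-hot
    count: 3
    config:
      node.roles: ["data_hot", "data_content", "ingest"]
    podTemplate:
      spec:
        nodeSelector:
          nodegroup: hot-nodes    # node group with fast storage
  - name: data-cold
    count: 2
    config:
      node.roles: ["data_cold"]
    podTemplate:
      spec:
        nodeSelector:
          nodegroup: cold-nodes   # node group with cheaper/slower storage

... and then ILM policies would move indices between the tiers, I assume?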

G

... so with the quickstart basic cluster, I'm really thinking there is not enough for me to "stuff" up, but sure enough, kubectl get elasticsearch just returns HEALTH = unknown and that's where it is stuck...

PS: I did add a namespace (-n elastic-system) to the apply, but I can't see that being the problem.

?

G

Can anyone help with figuring out why my quickstart deployment is just sitting in a pending state...

deployment on AWS EKS.

  1. create cluster
  2. execute EBS role/security implementation for PV's CSI driver
  3. deploy operator
  4. deploy quickstart cluster
    -> pod = pending
    -> health unknown
    -> pvc pending

G

So, your issue seems to be that it is trying to provision a PVC. Question: did you use the provided quick deploy here: Deploy an Elasticsearch cluster | Elastic Cloud on Kubernetes [1.9] | Elastic, or did you modify it to include persistent storage?

If you are trying to use persistent storage, I recommend reading these parts of the docs:

I'd also recommend reading over: Persistent Volumes | Kubernetes to get a better understanding of persistent storage within Kubernetes.
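As a rough example of what the persistent storage piece looks like in the Elasticsearch manifest (this is just the nodeSet fragment; the size and storage class are placeholders):

  nodeSets:
  - name: default
    count: 1
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data   # ECK expects the data claim to use this name
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 50Gi          # placeholder size
        storageClassName: gp2      # placeholder; omit to use the cluster default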

Yeah... thinking it's a PVC problem.

I created a new EKS cluster using eksctl,
then followed the AWS CSI driver configuration doc to set that up...
Then created the operator as per Quickstart | Elastic Cloud on Kubernetes [1.9] | Elastic, followed by step 2 / the cluster create.

Nothing else, as nothing else was implied to be needed. (I've found that when I do things I think should be needed, I end up breaking things even more)... :frowning: hehehehe

I'm reading and reading and watching videos... from what I can see, by not specifying a storage class it follows the default configuration, which in my case is gp2 via the installed CSI driver.

Some logs below.
G

Georges-MacBook-Pro.local:/Users/george/Downloads/dev/devlab_aws/elastic > kubectl get elasticsearch
NAME         HEALTH    NODES   VERSION   PHASE             AGE
quickstart   unknown           7.16.1    ApplyingChanges   25h
Georges-MacBook-Pro.local:/Users/george/Downloads/dev/devlab_aws/elastic > kubectl get pv
No resources found
Georges-MacBook-Pro.local:/Users/george/Downloads/dev/devlab_aws/elastic > kubectl get pvc
NAME                                         STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
elasticsearch-data-quickstart-es-default-0   Pending                                      gp2            25h
Georges-MacBook-Pro.local:/Users/george/Downloads/dev/devlab_aws/elastic > kubectl get sc
NAME            PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
gp2 (default)   kubernetes.io/aws-ebs   Delete          WaitForFirstConsumer   false                  28h
Georges-MacBook-Pro.local:/Users/george/Downloads/dev/devlab_aws/elastic > kubectl describe pvc
Name:          elasticsearch-data-quickstart-es-default-0
Namespace:     default
StorageClass:  gp2
Status:        Pending
Volume:
Labels:        common.k8s.elastic.co/type=elasticsearch
               elasticsearch.k8s.elastic.co/cluster-name=quickstart
               elasticsearch.k8s.elastic.co/statefulset-name=quickstart-es-default
Annotations:   <none>
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode:    Filesystem
Used By:       quickstart-es-default-0
Events:
  Type    Reason               Age                     From                         Message
  ----    ------               ----                    ----                         -------
  Normal  WaitForPodScheduled  3m13s (x6143 over 25h)  persistentvolume-controller  waiting for pod quickstart-es-default-0 to be scheduled
Georges-MacBook-Pro.local:/Users/george/Downloads/dev/devlab_aws/elastic > kubectl describe sc
Name:            gp2
IsDefaultClass:  Yes
Annotations:     kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"storage.k8s.io/v1","kind":"StorageClass","metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"},"name":"gp2"},"parameters":{"fsType":"ext4","type":"gp2"},"provisioner":"kubernetes.io/aws-ebs","volumeBindingMode":"WaitForFirstConsumer"}
,storageclass.kubernetes.io/is-default-class=true
Provisioner:           kubernetes.io/aws-ebs
Parameters:            fsType=ext4,type=gp2
AllowVolumeExpansion:  <unset>
MountOptions:          <none>
ReclaimPolicy:         Delete
VolumeBindingMode:     WaitForFirstConsumer
Events:                <none>
Georges-MacBook-Pro.local:/Users/george/Downloads/dev/devlab_aws/elastic >
Georges-MacBook-Pro.local:/Users/george/Downloads/dev/devlab_aws/elastic > kubectl get pod
NAME                      READY   STATUS    RESTARTS   AGE
quickstart-es-default-0   0/1     Pending   0          25h
Georges-MacBook-Pro.local:/Users/george/Downloads/dev/devlab_aws/elastic >

So, this appears to most probably be an issue with AWS storage provisioning. I have never used this setup, so I'm not familiar with it and probably won't be of much help trying to debug it. The only thing I can recommend is googling something like "AWS EKS gp2 pvc pending", seeing what information you can find, and troubleshooting with that.
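The one generic thing I can suggest (assuming the quickstart resource names) is to look at the scheduling events on the Pending pod and at the ECK operator logs, e.g.:

# Why is the pod stuck in Pending? Check the Events section at the bottom.
kubectl describe pod quickstart-es-default-0

# Anything useful from the ECK operator?
kubectl -n elastic-system logs statefulset/elastic-operator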

Yeah... been doing the googling thing for the last 4 days...

I'm up against a wall and looking for someone that's done this on EKS, as I also think this is AWS EKS PV/PVC/SC specific and I'm just missing something... I ran into the same problem when I tried to deploy Prometheus via an operator.

G

Since this is mainly a Kubernetes issue, I'd recommend asking on a more Kubernetes-focused forum, where you can probably find someone who is able to help. Also, I'm not familiar with AWS, but I'm sure they have support which may be able to assist as well.

:slight_smile: - this is not the only place I'm asking for help... I just figured that as AWS is not small, there might be skills here to help me get this deployed on AWS EKS, as I'm surely not the first.

G

... resolved...
The AWS EKS document that explained how to enable PV storage on the EKS cluster was "buggy". I found another document: Amazon EBS CSI Driver :: Amazon EKS Workshop

That worked. Once I followed this document, the Elastic cluster deployment worked.

G

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.