Hello Team,
First I installed an unsecured version of Elasticsearch and Kibana on Kubernetes using Helm charts, which worked fine. Then I enabled security on both of them. I managed to resolve the issues with Elasticsearch, but Kibana is still having problems. Below are the values.yaml files for both.
My Stack
3 x Master Pods
2 x Data Pods
2 x Ingest (coordinating) Pods
2 x Kibana Pods
Ingest Pod values.yaml:-
clusterName: "es"
nodeGroup: "client"
masterService: "es-master-service"
roles:
  master: "false"
  ingest: "true"
  data: "false"
  remote_cluster_client: "false"
  ml: "false"
replicas: 2
minimumMasterNodes: 2
esConfig:
  elasticsearch.yml: |
    xpack.security.enabled: true
    xpack.monitoring.collection.enabled: true
    xpack.security.transport.ssl.enabled: true
    xpack.security.transport.ssl.verification_mode: certificate
    xpack.security.transport.ssl.keystore.path: /usr/share/elasticsearch/config/certs-gen/keystore.p12
    xpack.security.transport.ssl.truststore.path: /usr/share/elasticsearch/config/certs-gen/keystore.p12
    xpack.security.http.ssl.enabled: true
    xpack.security.http.ssl.truststore.path: /usr/share/elasticsearch/config/certs-gen/keystore.p12
    xpack.security.http.ssl.keystore.path: /usr/share/elasticsearch/config/certs-gen/keystore.p12
extraEnvs:
  - name: ELASTIC_PASSWORD
    valueFrom:
      secretKeyRef:
        name: es-credentials
        key: password
  - name: ELASTIC_USERNAME
    valueFrom:
      secretKeyRef:
        name: es-credentials
        key: username
secretMounts:
  - name: elastic-certificates
    secretName: es-cert
    path: /usr/share/elasticsearch/config/certs
image: "470776511283.dkr.ecr.ap-south-1.amazonaws.com/dev-reco-elasticsearch"
imageTag: "latest"
imagePullPolicy: "IfNotPresent"
esJavaOpts: "-Xms2g -Xmx2g"
resources:
  requests:
    cpu: "100m"
    memory: "2Gi"
  limits:
    cpu: "300m"
    memory: "4Gi"
networkHost: "0.0.0.0"
volumeClaimTemplate:
  accessModes: [ "ReadWriteOnce" ]
  resources:
    requests:
      storage: 30Gi
rbac:
  create: false
  serviceAccountAnnotations: {}
  serviceAccountName: ""
podSecurityPolicy:
  create: false
  name: ""
  spec:
    privileged: true
    fsGroup:
      rule: RunAsAny
    runAsUser:
      rule: RunAsAny
    seLinux:
      rule: RunAsAny
    supplementalGroups:
      rule: RunAsAny
    volumes:
      - secret
      - configMap
      - persistentVolumeClaim
      - emptyDir
persistence:
  enabled: false
extraVolumes:
  - emptyDir: {}
    name: "storage"
  - name: tls-certificates
    emptyDir: {}
extraVolumeMounts:
  - name: storage
    mountPath: /data
  - name: tls-certificates
    mountPath: /usr/share/elasticsearch/config/certs-gen
extraInitContainers:
  - name: initial-setup
    image: busybox:1.28
    command: ["/bin/sh","-c"]
    args:
      - sysctl -w vm.max_map_count=262144;
        sysctl -w fs.file-max=65535;
        ulimit -n 65536;
        ulimit -u 8192;
    securityContext:
      privileged: true
  - name: setup-tls-cert
    image: "470776511283.dkr.ecr.ap-south-1.amazonaws.com/dev-reco-elasticsearch"
    command:
      - sh
      - -c
      - |
        #!/usr/bin/env bash
        set -euo pipefail
        elasticsearch-certutil cert \
          --name ${NODE_NAME} \
          --days 1000 \
          --ip ${POD_IP} \
          --dns ${NODE_NAME},${POD_SERVICE_NAME},${POD_SERVICE_NAME_HEADLESS},${NODE_NAME}.${POD_SERVICE_NAME},${NODE_NAME}.${POD_SERVICE_NAME_HEADLESS} \
          --ca-cert /usr/share/elasticsearch/config/certs/tls.crt \
          --ca-key /usr/share/elasticsearch/config/certs/tls.key \
          --ca-pass "" \
          --pass "" \
          --out /usr/share/elasticsearch/config/certs-gen/keystore.p12
    env:
      - name: NODE_NAME
        valueFrom:
          fieldRef:
            fieldPath: metadata.name
      - name: POD_IP
        valueFrom:
          fieldRef:
            fieldPath: status.podIP
      - name: POD_SERVICE_NAME
        value: "es-client-service"
      - name: POD_SERVICE_NAME_HEADLESS
        value: "es-client-service-headless"
    volumeMounts:
      - name: elastic-certificates
        mountPath: /usr/share/elasticsearch/config/certs
      - name: tls-certificates
        mountPath: /usr/share/elasticsearch/config/certs-gen
antiAffinityTopologyKey: "kubernetes.io/hostname"
antiAffinity: "hard"
podManagementPolicy: "Parallel"
enableServiceLinks: true
protocol: https
httpPort: 9200
transportPort: 9300
service:
  labels: {}
  labelsHeadless: {}
  type: ClusterIP
  nodePort: ""
  annotations: {}
  httpPortName: http
  transportPortName: transport
  loadBalancerIP: ""
  loadBalancerSourceRanges: []
  externalTrafficPolicy: ""
updateStrategy: RollingUpdate
maxUnavailable: 1
podSecurityContext:
  fsGroup: 1000
  runAsUser: 1000
securityContext:
  capabilities:
    drop:
      - ALL
  runAsNonRoot: true
  runAsUser: 1000
terminationGracePeriod: 120
sysctlVmMaxMapCount: 262144
readinessProbe:
  failureThreshold: 3
  initialDelaySeconds: 200
  periodSeconds: 10
  successThreshold: 3
  timeoutSeconds: 5
clusterHealthCheckParams: "wait_for_status=green&timeout=1s"
ingress:
  enabled: false
sysctlInitContainer:
  enabled: false
http:
  enabled: false
transport:
  enabled: false
Kibana values.yaml :-
elasticsearchHosts: "https://es-client.elasticsearch.svc.cluster.local:9200"
replicas: 2
extraEnvs:
  - name: 'ELASTICSEARCH_USERNAME'
    valueFrom:
      secretKeyRef:
        name: kibana-system-credentials
        key: username
  - name: 'ELASTICSEARCH_PASSWORD'
    valueFrom:
      secretKeyRef:
        name: kibana-system-credentials
        key: password
  - name: 'KIBANA_SECURITY_ENCRYPTION_KEY'
    valueFrom:
      secretKeyRef:
        name: kibana-encrypt-key
        key: encryptionkey
  - name: 'KIBANA_ENCRYPTEDSAVEDOBJECTS_ENCRYPTION_KEY'
    valueFrom:
      secretKeyRef:
        name: kibana-encryptedsavedobjects-encrypt-key
        key: encryptionkey
  - name: 'KIBANA_REPORTING_ENCRYPTION_KEY'
    valueFrom:
      secretKeyRef:
        name: kibana-reporting-encrypt-key
        key: encryptionkey
  - name: "NODE_OPTIONS"
    value: "--max-old-space-size=1800"
secretMounts:
  - name: kibana-certificates
    secretName: kibana-cert
    path: /usr/share/kibana/config/certs/kibana
  - name: elastic-certificates
    secretName: es-cert
    path: /usr/share/kibana/config/certs/es
image: "470776511283.dkr.ecr.ap-south-1.amazonaws.com/dev-reco-kibana"
imageTag: "latest"
imagePullPolicy: "IfNotPresent"
labels:
  component: elasticsearch
  role: kibana
resources:
  requests:
    cpu: "100m"
    memory: "2Gi"
  limits:
    cpu: "500m"
    memory: "4Gi"
protocol: https
serverHost: "0.0.0.0"
healthCheckPath: "/status"
kibanaConfig:
  kibana.yml: |-
    server.ssl.enabled: true
    server.ssl.key: /usr/share/kibana/config/certs/kibana/tls.key
    server.ssl.certificate: /usr/share/kibana/config/certs/kibana/tls.crt
    xpack.security.encryptionKey: ${KIBANA_SECURITY_ENCRYPTION_KEY}
    xpack.encryptedSavedObjects.encryptionKey: ${KIBANA_ENCRYPTEDSAVEDOBJECTS_ENCRYPTION_KEY}
    xpack.reporting.encryptionKey: ${KIBANA_REPORTING_ENCRYPTION_KEY}
    elasticsearch:
      hosts: ${ELASTICSEARCH_HOSTS}
      username: ${ELASTICSEARCH_USERNAME}
      password: ${ELASTICSEARCH_PASSWORD}
      ssl:
        verificationMode: certificate
        key: /usr/share/kibana/config/certs/es/tls.key
        certificate: /usr/share/kibana/config/certs/es/tls.crt
        certificateAuthorities: /usr/share/kibana/config/certs/es/ca.crt
podSecurityContext:
  fsGroup: 1000
securityContext:
  capabilities:
    drop:
      - ALL
  # readOnlyRootFilesystem: true
  runAsNonRoot: true
  runAsUser: 1000
serviceAccount: ""
priorityClassName: ""
httpPort: 5601
extraVolumes: []
extraVolumeMounts: []
extraContainers: ""
extraInitContainers: ""
updateStrategy:
  type: "RollingUpdate"
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
              - key: app
                operator: In
                values:
                  - es-data
          topologyKey: kubernetes.io/hostname
service:
  # name: kibana
  type: NodePort
  loadBalancerIP: ""
  port: 5601
  nodePort: ""
  labels: {}
  annotations:
    alb.ingress.kubernetes.io/target-type: ip
  loadBalancerSourceRanges: []
  # 0.0.0.0/0
  httpPortName: http
ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/group.name: "dev"
    alb.ingress.kubernetes.io/group.order: '10'
    alb.ingress.kubernetes.io/load-balancer-name: k8s-elk-alb
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTP": 80},{"HTTPS": 443}]'
    alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:ap-south-1:470776511283:certificate/28038375-c1fe-477d-b0e0-a9ab89e84801
    alb.ingress.kubernetes.io/ssl-policy: ELBSecurityPolicy-FS-1-2-Res-2020-10
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/ssl-redirect: '443'
    alb.ingress.kubernetes.io/healthcheck-path: /status
    alb.ingress.kubernetes.io/healthcheck-protocol: HTTPS
    alb.ingress.kubernetes.io/success-codes: 200,302
  hosts:
    - host: kibana.vinreco.in
      paths:
        - path: /*
          backend:
            serviceName: kibana-kibana
            servicePort: 80
  tls: []
readinessProbe:
  failureThreshold: 3
  initialDelaySeconds: 10
  periodSeconds: 10
  successThreshold: 3
  timeoutSeconds: 5
I had to change the Kibana healthCheckPath from /app/kibana to /status; otherwise the pods did not come up because the readiness probe kept failing.
Below is the log from one of the Kibana pods:
{"type":"log","@timestamp":"2021-07-31T06:11:40+00:00","tags":["warning","plugins","usageCollection","usage-collection","collector-set"],"pid":952,"message":"StatusCodeError: [illegal_argument_exception] node [es-client-1] does not have the [remote_cluster_client] role\n at respond (/usr/share/kibana/node_modules/elasticsearch/src/lib/transport.js:349:15)\n at checkRespForFailure (/usr/share/kibana/node_modules/elasticsearch/src/lib/transport.js:306:7)\n at HttpConnector.<anonymous> (/usr/share/kibana/node_modules/elasticsearch/src/lib/connectors/http.js:173:7)\n at IncomingMessage.wrapper (/usr/share/kibana/node_modules/lodash/lodash.js:4991:19)\n at IncomingMessage.emit (events.js:387:35)\n at endReadableNT (internal/streams/readable.js:1317:12)\n at processTicksAndRejections (internal/process/task_queues.js:82:21) {\n status: 400,\n displayName: 'BadRequest',\n path: '/_remote/info',\n query: {},\n body: {\n error: {\n root_cause: [Array],\n type: 'illegal_argument_exception',\n reason: 'node [es-client-1] does not have the [remote_cluster_client] role'\n },\n status: 400\n },\n statusCode: 400,\n response: '{\"error\":{\"root_cause\":[{\"type\":\"illegal_argument_exception\",\"reason\":\"node [es-client-1] does not have the [remote_cluster_client] role\"}],\"type\":\"illegal_argument_exception\",\"reason\":\"node [es-client-1] does not have the [remote_cluster_client] role\"},\"status\":400}',\n toString: [Function (anonymous)],\n toJSON: [Function (anonymous)]\n}"}
{"type":"response","@timestamp":"2021-07-31T06:38:08+00:00","tags":[],"pid":951,"method":"get","statusCode":200,"req":{"url":"/status","method":"get","headers":{"host":"localhost:5601","user-agent":"curl/7.61.1","accept":"*/*"},"remoteAddress":"127.0.0.1","userAgent":"curl/7.61.1"},"res":{"statusCode":200,"responseTime":21,"contentLength":133868},"message":"GET /status 200 21ms - 130.7KB"}
{"type":"response","@timestamp":"2021-07-31T06:38:09+00:00","tags":[],"pid":951,"method":"get","statusCode":302,"req":{"url":"/status","method":"get","headers":{"host":"192.168.173.235:5601","connection":"close","user-agent":"ELB-HealthChecker/2.0","accept-encoding":"gzip, compressed"},"remoteAddress":"192.168.94.180","userAgent":"ELB-HealthChecker/2.0"},"res":{"statusCode":302,"responseTime":2},"message":"GET /status 302 2ms"}
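The warning in the first log line above names the [remote_cluster_client] role, which the ingest values.yaml sets to "false" for the es-client nodes that Kibana connects to. A minimal sketch of a variant that could be tried, assuming the /_remote/info call only needs that role on the nodes Kibana talks to (untested; only the one value differs from the roles block in the ingest values.yaml):

```yaml
# Ingest (client) values.yaml, roles block only.
# Assumption: enabling remote_cluster_client on the client nodes
# stops the 400 response on /_remote/info seen in the Kibana log.
roles:
  master: "false"
  ingest: "true"
  data: "false"
  remote_cluster_client: "true"   # was "false"
  ml: "false"
```

Since the message is logged at warning level by the usage collector, it may well be unrelated to the 502.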
Current cluster status
NAME                                 READY   STATUS    RESTARTS   AGE
pod/es-client-0                      1/1     Running   0          19h
pod/es-client-1                      1/1     Running   0          19h
pod/es-data-0                        1/1     Running   0          19h
pod/es-data-1                        1/1     Running   0          19h
pod/es-master-0                      1/1     Running   0          19h
pod/es-master-1                      1/1     Running   0          19h
pod/es-master-2                      1/1     Running   0          19h
pod/kibana-kibana-694b6c7684-7j4bl   1/1     Running   0          39m
pod/kibana-kibana-694b6c7684-ff8zv   1/1     Running   0          40m
Please review and advise whether the security settings are correct. What can we change to get rid of the 502 Bad Gateway error?
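One thing the Kibana ingress annotations do not set is the protocol the ALB uses towards the pods. A hedged sketch, assuming the AWS Load Balancer Controller is what handles this ingress and that its backend-protocol annotation defaults to HTTP: because Kibana runs with server.ssl.enabled: true, the ALB may be sending plain HTTP to an HTTPS port, which could produce a 502 even though the HTTPS health check passes. Untested:

```yaml
# Kibana values.yaml, ingress section only.
# Assumption: the existing annotations stay unchanged; only the last line is new.
ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/healthcheck-protocol: HTTPS
    alb.ingress.kubernetes.io/backend-protocol: HTTPS   # new: talk HTTPS to the Kibana pods
```

The backend servicePort of 80 also differs from the service port of 5601, so that mapping may be worth checking as well.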
Regards
Nitin G