I'm using ECK Operator 1.0.0-beta1 running on Rancher 2.0.
I have a custom Elasticsearch image that mounts an off-cluster NFS share for snapshot backups. The mount works correctly, but when I upgrade the cluster (e.g. from 7.4.0 to 7.4.1) I see the following behavior:
1. Kubernetes tries to remove the last node in the cluster.
2. This seems to time out, which results in the pod being killed (I think).
3. Then the entire cluster detects "Readiness Probe Failed" and falls over.
4. The cluster comes back on its own, and the killed node now has the new version.
5. Repeat for every node in the cluster.
No data is lost during this, but the cluster restarts once for every pod.
The Dockerfile looks like this:
FROM docker.elastic.co/elasticsearch/elasticsearch:7.4.1
RUN yum -y install nfs-utils
RUN mkdir /mnt/snapshots
COPY ./my-start.sh /usr/local/bin/my-start.sh
ENTRYPOINT ["/usr/local/bin/my-start.sh"]
The my-start.sh script adds a mount command before sourcing the original entrypoint.
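A minimal sketch of that kind of wrapper (the NFS server and export below are placeholders, and the stock entrypoint in the official image is /usr/local/bin/docker-entrypoint.sh; the sketch exec's the stock entrypoint rather than sourcing it, so SIGTERM reaches Elasticsearch directly on shutdown):

#!/bin/bash
# Sketch only: the NFS server and export names are placeholders.
set -e

# Mount the off-cluster NFS share used for snapshot backups.
mount -t nfs -o nolock nfs-server:/volume1/search-quickstart /mnt/snapshots

# Hand off to the stock Elasticsearch entrypoint. exec keeps the Elasticsearch
# process as PID 1 so it receives termination signals directly.
exec /usr/local/bin/docker-entrypoint.sh "$@"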
You could maybe get rid of the custom Docker image by:
adding an init container that does the mount
using your preStop hook to do the umount
This way you don't have to deal with building your own image and keeping it up to date.
Why would a shutdown timeout of a single instance cause the entire cluster to flap?
This is not expected. I'd like to understand it better.
Can you share your Elasticsearch yaml manifest?
What do you mean by "the entire cluster detects 'Readiness Probe Failed' and falls over"? Do all Pods become non-ready, so the service cannot route to the cluster?
Can you share some logs of the operator and Elasticsearch while this happens?
I've been able to recreate this issue using a custom image, but the problem goes away when I use lifecycle exec commands.
To demonstrate the behavior I've made a short video which starts when I apply a change from 7.4.0 to 7.4.1: https://youtu.be/4icmwoyN8uY
(I have operator log files if you're interested in chasing this behavior down - but I think it comes down to my image not exiting cleanly.)
I have eliminated this behavior by moving my logic to lifecycle exec commands like so:
cat <<EOF | kubectl -n test apply -f -
apiVersion: elasticsearch.k8s.elastic.co/v1beta1
kind: Elasticsearch
metadata:
  name: quickstart
spec:
  version: 7.4.0
  nodeSets:
  - name: default
    count: 3
    config:
      node.master: true
      node.data: true
      node.ingest: true
      path.repo: [ "/var/local" ]
      xpack.security.authc.realms:
        native:
          native1:
            order: 1
    podTemplate:
      spec:
        containers:
        - name: elasticsearch
          resources:
            limits:
              memory: 2G
              cpu: 2
          env:
          - name: ES_JAVA_OPTS
            value: "-Xms1g -Xmx1g"
          securityContext:
            capabilities:
              add:
              - SYS_ADMIN
          # Important: you must mount to a path that already exists in the image,
          # because postStart executes too late to create the mount point.
          # I used /var/local because it was empty and seemed reasonable.
          lifecycle:
            postStart:
              exec:
                command:
                - "sh"
                - "-c"
                - >
                  yum -y install nfs-utils &&
                  mount -vvv -t nfs -o nolock nfs-server:/volume1/search-quickstart /var/local
            preStop:
              exec:
                command: ["/usr/bin/umount", "/var/local"]
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 5Gi
        storageClassName: local-path
EOF
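With the share mounted at /var/local and that path whitelisted via path.repo, the NFS mount can then be registered as a snapshot repository. A rough sketch, assuming ECK's default secret and service names for a cluster called quickstart and an arbitrary repository name:

# Assumptions: "nfs_snapshots" is an arbitrary repository name; the secret and
# service names below are ECK's defaults for a cluster named "quickstart".
PASSWORD=$(kubectl -n test get secret quickstart-es-elastic-user \
  -o go-template='{{.data.elastic | base64decode}}')

# In a separate shell, keep a port-forward open to the cluster's HTTP service:
#   kubectl -n test port-forward service/quickstart-es-http 9200

curl -k -u "elastic:$PASSWORD" -X PUT "https://localhost:9200/_snapshot/nfs_snapshots" \
  -H 'Content-Type: application/json' \
  -d '{ "type": "fs", "settings": { "location": "/var/local" } }'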
The one downside of this approach is that I now have to install a bunch of packages every time a pod initializes, but overall it's still cleaner than maintaining a custom image.