Hi, I started a rolling upgrade following the directions in the guide. I have 6 data nodes (n1 and n2 are also ingest nodes). After successfully upgrading n4 I upgraded n6, and now the cluster is unable to get from yellow to green. After working through the "Red or yellow cluster status" page, it looks like there is no place to allocate a replica of one of the shards of a new index.
I am running with these options:
- cluster.routing.allocation.awareness.force.rack.values: r1, r2, r3
- cluster.routing.allocation.awareness.attributes: rack
and node.attr.rack options for my nodes are configured as:
- n1 node.attr.rack [r1]
- n2 node.attr.rack [r2]
- n3 node.attr.rack [r1]
- n4 node.attr.rack [r2]
- n5 node.attr.rack [r1]
- n6 node.attr.rack [r2]
The cluster allocation explanation gives me:
n1 - node_version (can't allocate a replica shard to an older version)
n2 - node_version (can't allocate a replica shard to an older version) and awareness (a copy is already on r2, and per cluster routing allocation awareness the second copy needs to be on r1 or r3)
n3 - node_version (can't allocate a replica shard to an older version)
n4 - same_shard (already holds a copy) and awareness (a copy is already on r2, and per cluster routing allocation awareness the second copy needs to be on r1 or r3)
n5 - node_version (can't allocate a replica shard to an older version)
n6 - awareness (a copy is already on r2, and per cluster routing allocation awareness the second copy needs to be on r1 or r3)
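For reference, an explanation like the one above is typically requested from the cluster allocation explain API (`_cluster/allocation/explain`). Below is a minimal Python sketch that only builds such a request for a replica shard and does not send it. The host `http://localhost:9200`, the index name `my-new-index`, and shard number 0 are placeholders for illustration, not values from this cluster.

```python
import json
import urllib.request


def build_explain_request(base_url, index, shard=0):
    """Build (but do not send) an allocation explain request for a replica shard."""
    # "primary": False asks about the replica copy rather than the primary.
    body = {"index": index, "shard": shard, "primary": False}
    return urllib.request.Request(
        url=f"{base_url}/_cluster/allocation/explain",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder host and index name (assumptions, not taken from the post).
request = build_explain_request("http://localhost:9200", "my-new-index")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` against a reachable cluster should return JSON with a decision per node, which is where reasons such as node_version, same_shard and awareness come from.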
My mistake is that, without thinking, I started the upgrade with two consecutive r2 nodes, i.e., n4 and n6. If I had done one from r2 and another from r1, e.g., n4 and n5, I would not be in this mess at the moment.
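The reasoning above can be checked with a small toy model. This is only a sketch of the three reasons reported in the explanation, not how Elasticsearch actually implements its allocation deciders. It assumes one primary (on n4, since n4 already holds a copy) and one replica, and it treats "awareness" as "the rack already holds a copy", which is a simplification that fits this two-copy, three-rack-value case. The `Node` class and `blockers` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    rack: str
    upgraded: bool


def blockers(node, primary):
    """Return the reasons (simplified) why the replica cannot go to this node."""
    reasons = []
    if node.name == primary.name:
        reasons.append("same_shard")      # node already holds a copy
    if primary.upgraded and not node.upgraded:
        reasons.append("node_version")    # replica cannot go to an older version
    if node.rack == primary.rack:
        reasons.append("awareness")       # rack already holds a copy
    return reasons


def make_cluster(upgraded_names):
    racks = {"n1": "r1", "n2": "r2", "n3": "r1", "n4": "r2", "n5": "r1", "n6": "r2"}
    return {n: Node(n, r, n in upgraded_names) for n, r in racks.items()}


def eligible_nodes(upgraded_names, primary_name="n4"):
    cluster = make_cluster(upgraded_names)
    primary = cluster[primary_name]
    return [n for n, node in cluster.items() if not blockers(node, primary)]


# Current state: n4 and n6 (both r2) are upgraded.
current = eligible_nodes({"n4", "n6"})
# Hypothetical state: n5 (r1) is upgraded as well.
with_n5 = eligible_nodes({"n4", "n6", "n5"})
print(current, with_n5)
```

Under these assumptions no node is eligible in the current state, while an upgraded r1 node such as n5 would have no blocker left in this model.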
Also, after the upgrade there is an issue on n6 with JVM heap memory. When I run /usr/share/elasticsearch/bin/elasticsearch --version I get:
Exception in thread "main" java.lang.RuntimeException: starting java failed with [1]
output:
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 34359738368 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /var/log/elasticsearch/hs_err_pid53775.log
error:
OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x00007fe0b4000000, 34359738368, 0) failed; error='Not enough space' (errno=12)
at org.elasticsearch.tools.launchers.JvmOption.flagsFinal(JvmOption.java:119)
at org.elasticsearch.tools.launchers.JvmOption.findFinalOptions(JvmOption.java:81)
at org.elasticsearch.tools.launchers.JvmErgonomics.choose(JvmErgonomics.java:38)
at org.elasticsearch.tools.launchers.JvmOptionsParser.jvmOptions(JvmOptionsParser.java:135)
at org.elasticsearch.tools.launchers.JvmOptionsParser.main(JvmOptionsParser.java:86)
Despite the error, the node seems to work and has joined the cluster. The node has 64GB of memory and jvm.options already has -Xms32g and -Xmx32g.
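As a quick sanity check on the numbers in the error, the size of the failed mapping can be converted to GiB and compared with the configured heap. The sketch below only does that arithmetic; the helper name `bytes_to_gib` is made up for illustration, and the check does not by itself establish why the mapping failed.

```python
GIB = 2 ** 30  # bytes per GiB


def bytes_to_gib(n_bytes):
    """Convert a byte count to GiB."""
    return n_bytes / GIB


failed_mmap_bytes = 34359738368   # from the "failed to map" line in the error
configured_heap_gib = 32          # from -Xms32g / -Xmx32g in jvm.options

failed_mmap_gib = bytes_to_gib(failed_mmap_bytes)
print(failed_mmap_gib)
print(failed_mmap_gib == configured_heap_gib)
```

So the mapping that failed is exactly the size of the configured heap, i.e. the JVM launched by the command tried to commit a full 32g heap.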
So, I have two issues. For the first one - what are my options?
- Can I just upgrade n5 (or n1 or n3) to 7.17.7 despite the yellow cluster status?
- Should I change the cluster routing allocation settings temporarily?
What about the second issue? I can't just throw more RAM at it, can I?
I hope to get my cluster to green so I can get on with upgrading. My final goal is to upgrade to 8.5.