Can't upgrade elasticsearch 5.2.1 to 5.6.5

Arter_Xu · January 13, 2018, 12:40pm

Old Version:
Elasticsearch version : 5.2.1
JVM version : 1.8.0_71
OS version : centos 7.3
New Version:
Elasticsearch version : 5.6.5
JVM version : 1.8.0_131
OS version : centos 7.3

The details:
I have six Elasticsearch clusters of 20 nodes. I plan to upgrade them from 5.2.1 to 5.6.5 through a rolling upgrade, following https://www.elastic.co/guide/en/elasticsearch/reference/5.6/rolling-upgrades.html.
Five clusters upgraded successfully, but one of the six cannot be upgraded. When a node of the problem cluster is upgraded, it can't join the old cluster.
My English is poor, sorry.

before upgrade:
curl localhost:9200
{
  "name" : "xg-ops-elk-javaes-mgt-3",
  "cluster_name" : "xg-ops-elk-javaes-cluster",
  "cluster_uuid" : "W_xR97SqQ66yHEq9bQUDZQ",
  "version" : {
    "number" : "5.2.1",
    "build_hash" : "db0d481",
    "build_date" : "2017-02-09T22:05:32.386Z",
    "build_snapshot" : false,
    "lucene_version" : "6.4.1"
  },
  "tagline" : "You Know, for Search"
}
After upgrade and restart:
curl localhost:9200
{
  "name" : "xg-ops-elk-javaes-mgt-3",
  "cluster_name" : "xg-ops-elk-javaes-cluster",
  "cluster_uuid" : "_na_",
  "version" : {
    "number" : "5.6.5",
    "build_hash" : "6a37571",
    "build_date" : "2017-12-04T07:50:10.466Z",
    "build_snapshot" : false,
    "lucene_version" : "6.6.1"
  },
  "tagline" : "You Know, for Search"
}
The log is:
[2018-01-14T16:33:13,125][INFO ][o.e.n.Node               ] [xg-ops-elk-javaes-mgt-3] initialized
[2018-01-14T16:33:13,125][INFO ][o.e.n.Node               ] [xg-ops-elk-javaes-mgt-3] starting ...
[2018-01-14T16:33:13,265][INFO ][o.e.t.TransportService   ] [xg-ops-elk-javaes-mgt-3] publish_address {10.0.23.55:9300}, bound_addresses {127.0.0.1:9300}, {10.0.23.55:9300}
[2018-01-14T16:33:13,274][INFO ][o.e.b.BootstrapChecks    ] [xg-ops-elk-javaes-mgt-3] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
[2018-01-14T16:33:23,746][INFO ][o.e.d.z.ZenDiscovery     ] [xg-ops-elk-javaes-mgt-3] failed to send join request to master [{xg-ops-elk-javaes-mgt-2}{02U7L_IMSxSnZVyKNGyPKg}{sWovufJERle5VTH1o3s4ww}{10.0.19.68}{10.0.19.68:9300}], reason [RemoteTransportException[[xg-ops-elk-javaes-mgt-2][10.0.19.68:9300][internal:discovery/zen/join]]; nested: IllegalStateException[failure when sending a validation request to node]; nested: RemoteTransportException[[xg-ops-elk-javaes-mgt-3][10.0.23.55:9300][internal:discovery/zen/join/validate]]; nested: NullPointerException; ]
[2018-01-14T16:33:34,121][INFO ][o.e.d.z.ZenDiscovery     ] [xg-ops-elk-javaes-mgt-3] failed to send join request to master [{xg-ops-elk-javaes-mgt-2}{02U7L_IMSxSnZVyKNGyPKg}{sWovufJERle5VTH1o3s4ww}{10.0.19.68}{10.0.19.68:9300}], reason [RemoteTransportException[[xg-ops-elk-javaes-mgt-2][10.0.19.68:9300][internal:discovery/zen/join]]; nested: IllegalStateException[failure when sending a validation request to node]; nested: RemoteTransportException[[xg-ops-elk-javaes-mgt-3][10.0.23.55:9300][internal:discovery/zen/join/validate]]; nested: NullPointerException; ]
[2018-01-14T16:33:43,293][WARN ][o.e.n.Node               ] [xg-ops-elk-javaes-mgt-3] timed out while waiting for initial discovery state - timeout: 30s
[2018-01-14T16:33:43,303][INFO ][o.e.h.n.Netty4HttpServerTransport] [xg-ops-elk-javaes-mgt-3] publish_address {10.0.23.55:9200}, bound_addresses {127.0.0.1:9200}, {10.0.23.55:9200}
[2018-01-14T16:33:43,303][INFO ][o.e.n.Node               ] [xg-ops-elk-javaes-mgt-3] started
[2018-01-14T16:33:44,475][INFO ][o.e.d.z.ZenDiscovery     ] [xg-ops-elk-javaes-mgt-3] failed to send join request to master [{xg-ops-elk-javaes-mgt-2}{02U7L_IMSxSnZVyKNGyPKg}{sWovufJERle5VTH1o3s4ww}{10.0.19.68}{10.0.19.68:9300}], reason [RemoteTransportException[[xg-ops-elk-javaes-mgt-2][10.0.19.68:9300][internal:discovery/zen/join]]; nested: IllegalStateException[failure when sending a validation request to node]; nested: RemoteTransportException[[xg-ops-elk-javaes-mgt-3][10.0.23.55:9300][internal:discovery/zen/join/validate]]; nested: NullPointerException; ]
[2018-01-14T16:33:49,162][DEBUG][o.e.a.a.c.h.TransportClusterHealthAction] [xg-ops-elk-javaes-mgt-3] no known master node, scheduling a retry
.....
[2018-01-14T16:40:59,129][DEBUG][o.e.a.a.c.h.TransportClusterHealthAction] [xg-ops-elk-javaes-mgt-3] no known master node, scheduling a retry
[2018-01-14T16:41:08,687][INFO ][o.e.d.z.ZenDiscovery     ] [xg-ops-elk-javaes-mgt-3] failed to send join request to master [{xg-ops-elk-javaes-mgt-2}{02U7L_IMSxSnZVyKNGyPKg}{sWovufJERle5VTH1o3s4ww}{10.0.19.68}{10.0.19.68:9300}], reason [RemoteTransportException[[xg-ops-elk-javaes-mgt-2][10.0.19.68:9300][internal:discovery/zen/join]]; nested: IllegalStateException[failure when sending a validation request to node]; nested: RemoteTransportException[[xg-ops-elk-javaes-mgt-3][10.0.23.55:9300][internal:discovery/zen/join/validate]]; nested: NullPointerException; ]
[2018-01-14T16:41:09,134][DEBUG][o.e.a.a.c.h.TransportClusterHealthAction] [xg-ops-elk-javaes-mgt-3] no known master node, scheduling a retry
[2018-01-14T16:41:09,142][DEBUG][o.e.a.a.c.h.TransportClusterHealthAction] [xg-ops-elk-javaes-mgt-3] timed out while retrying [cluster:monitor/health] after failure (timeout [30s])
[2018-01-14T16:41:09,142][WARN ][r.suppressed             ] path: /_cluster/health, params: {level=indices}
org.elasticsearch.discovery.MasterNotDiscoveredException: null
	at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$4.onTimeout(TransportMasterNodeAction.java:209) [elasticsearch-5.6.5.jar:5.6.5]
	at org.elasticsearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:311) [elasticsearch-5.6.5.jar:5.6.5]
	at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:238) [elasticsearch-5.6.5.jar:5.6.5]
	at org.elasticsearch.cluster.service.ClusterService$NotifyTimeout.run(ClusterService.java:1056) [elasticsearch-5.6.5.jar:5.6.5]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:569) [elasticsearch-5.6.5.jar:5.6.5]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_131]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_131]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]

Please help me, I don't know how to solve this issue. Thanks.

dadoonet · January 13, 2018, 1:42pm

Please format your code using </> icon as explained in this guide. It will make your post more readable.

Or use markdown style like:

```
CODE
```

Did you upgrade all nodes? What are the other logs?

Arter_Xu · January 14, 2018, 8:52am

What other information can I provide?

dadoonet · January 14, 2018, 9:29am

What I asked for:

Did you upgrade all nodes? What are the other logs?

Arter_Xu · January 14, 2018, 9:31am

No. I am doing a rolling upgrade, so the nodes are upgraded one by one. The other logs have been updated today.
I mean, what other information do you need to help me? Thanks.

dadoonet · January 14, 2018, 9:33am

The logs of the current master node

Arter_Xu · January 14, 2018, 9:39am

The logs of the current master node are below:

[2018-01-14T17:36:24,392][WARN ][o.e.d.z.ZenDiscovery     ] [xg-ops-elk-javaes-mgt-2] failed to validate incoming join request from node [{xg-ops-elk-javaes-mgt-3}{SQaSuQ1aS-izcNs4P9yItQ}{_Wc_mrnfS5Ghb3RjuJLiJg}{10.0.23.55}{10.0.23.55:9300}]
org.elasticsearch.transport.RemoteTransportException: [xg-ops-elk-javaes-mgt-3][10.0.23.55:9300][internal:discovery/zen/join/validate]
Caused by: java.lang.NullPointerException
	at org.elasticsearch.cluster.node.DiscoveryNodeFilters.buildFromKeyValue(DiscoveryNodeFilters.java:73) ~[elasticsearch-5.2.1.jar:5.2.1]
	at org.elasticsearch.cluster.metadata.IndexMetaData$Builder.build(IndexMetaData.java:1044) ~[elasticsearch-5.2.1.jar:5.2.1]
	at org.elasticsearch.cluster.metadata.IndexMetaData.readFrom(IndexMetaData.java:724) ~[elasticsearch-5.2.1.jar:5.2.1]
	at org.elasticsearch.cluster.metadata.MetaData.readFrom(MetaData.java:676) ~[elasticsearch-5.2.1.jar:5.2.1]
	at org.elasticsearch.cluster.ClusterState.readFrom(ClusterState.java:659) ~[elasticsearch-5.2.1.jar:5.2.1]
	at org.elasticsearch.discovery.zen.MembershipAction$ValidateJoinRequest.readFrom(MembershipAction.java:171) ~[elasticsearch-5.2.1.jar:5.2.1]
	at org.elasticsearch.transport.TcpTransport.handleRequest(TcpTransport.java:1510) ~[elasticsearch-5.2.1.jar:5.2.1]
	at org.elasticsearch.transport.TcpTransport.messageReceived(TcpTransport.java:1396) ~[elasticsearch-5.2.1.jar:5.2.1]
	at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:74) ~[?:?]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[?:?]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[?:?]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) ~[?:?]
	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310) ~[?:?]
	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:297) ~[?:?]
	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:413) ~[?:?]
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:265) ~[?:?]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[?:?]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[?:?]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) ~[?:?]
	at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) ~[?:?]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[?:?]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[?:?]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) ~[?:?]
	at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:241) ~[?:?]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[?:?]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[?:?]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) ~[?:?]
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1334) ~[?:?]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[?:?]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[?:?]
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:926) ~[?:?]
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:134) ~[?:?]
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:644) ~[?:?]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:544) ~[?:?]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:498) ~[?:?]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:458) ~[?:?]
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858) ~[?:?]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_73]
[2018-01-14T17:36:34,728][WARN ][o.e.d.z.ZenDiscovery     ] [xg-ops-elk-javaes-mgt-2] failed to validate incoming join request from node [{xg-ops-elk-javaes-mgt-3}{SQaSuQ1aS-izcNs4P9yItQ}{_Wc_mrnfS5Ghb3RjuJLiJg}{10.0.23.55}{10.0.23.55:9300}]
org.elasticsearch.transport.RemoteTransportException: [xg-ops-elk-javaes-mgt-3][10.0.23.55:9300][internal:discovery/zen/join/validate]
Caused by: java.lang.NullPointerException

Arter_Xu · January 14, 2018, 9:43am

The error keeps repeating, but I can't understand where the problem is.

[2018-01-14T17:41:55,105][WARN ][o.e.d.z.ZenDiscovery     ] [xg-ops-elk-javaes-mgt-2] failed to validate incoming join request from node [{xg-ops-elk-javaes-mgt-3}{SQaSuQ1aS-izcNs4P9yItQ}{_Wc_mrnfS5Ghb3RjuJLiJg}{10.0.23.55}{10.0.23.55:9300}]
org.elasticsearch.transport.RemoteTransportException: [xg-ops-elk-javaes-mgt-3][10.0.23.55:9300][internal:discovery/zen/join/validate]
Caused by: java.lang.NullPointerException
[2018-01-14T17:42:05,469][WARN ][o.e.d.z.ZenDiscovery     ] [xg-ops-elk-javaes-mgt-2] failed to validate incoming join request from node [{xg-ops-elk-javaes-mgt-3}{SQaSuQ1aS-izcNs4P9yItQ}{_Wc_mrnfS5Ghb3RjuJLiJg}{10.0.23.55}{10.0.23.55:9300}]
org.elasticsearch.transport.RemoteTransportException: [xg-ops-elk-javaes-mgt-3][10.0.23.55:9300][internal:discovery/zen/join/validate]
Caused by: java.lang.NullPointerException
[2018-01-14T17:42:15,797][WARN ][o.e.d.z.ZenDiscovery     ] [xg-ops-elk-javaes-mgt-2] failed to validate incoming join request from node [{xg-ops-elk-javaes-mgt-3}{SQaSuQ1aS-izcNs4P9yItQ}{_Wc_mrnfS5Ghb3RjuJLiJg}{10.0.23.55}{10.0.23.55:9300}]
org.elasticsearch.transport.RemoteTransportException: [xg-ops-elk-javaes-mgt-3][10.0.23.55:9300][internal:discovery/zen/join/validate]
Caused by: java.lang.NullPointerException
[2018-01-14T17:42:26,198][WARN ][o.e.d.z.ZenDiscovery     ] [xg-ops-elk-javaes-mgt-2] failed to validate incoming join request from node [{xg-ops-elk-javaes-mgt-3}{SQaSuQ1aS-izcNs4P9yItQ}{_Wc_mrnfS5Ghb3RjuJLiJg}{10.0.23.55}{10.0.23.55:9300}]
org.elasticsearch.transport.RemoteTransportException: [xg-ops-elk-javaes-mgt-3][10.0.23.55:9300][internal:discovery/zen/join/validate]
Caused by: java.lang.NullPointerException
[2018-01-14T17:42:36,521][WARN ][o.e.d.z.ZenDiscovery     ] [xg-ops-elk-javaes-mgt-2] failed to validate incoming join request from node [{xg-ops-elk-javaes-mgt-3}{SQaSuQ1aS-izcNs4P9yItQ}{_Wc_mrnfS5Ghb3RjuJLiJg}{10.0.23.55}{10.0.23.55:9300}]
org.elasticsearch.transport.RemoteTransportException: [xg-ops-elk-javaes-mgt-3][10.0.23.55:9300][internal:discovery/zen/join/validate]
Caused by: java.lang.NullPointerException
[2018-01-14T17:42:46,872][WARN ][o.e.d.z.ZenDiscovery     ] [xg-ops-elk-javaes-mgt-2] failed to validate incoming join request from node [{xg-ops-elk-javaes-mgt-3}{SQaSuQ1aS-izcNs4P9yItQ}{_Wc_mrnfS5Ghb3RjuJLiJg}{10.0.23.55}{10.0.23.55:9300}]
org.elasticsearch.transport.RemoteTransportException: [xg-ops-elk-javaes-mgt-3][10.0.23.55:9300][internal:discovery/zen/join/validate]
Caused by: java.lang.NullPointerException
[2018-01-14T17:42:57,229][WARN ][o.e.d.z.ZenDiscovery     ] [xg-ops-elk-javaes-mgt-2] failed to validate incoming join request from node [{xg-ops-elk-javaes-mgt-3}{SQaSuQ1aS-izcNs4P9yItQ}{_Wc_mrnfS5Ghb3RjuJLiJg}{10.0.23.55}{10.0.23.55:9300}]
org.elasticsearch.transport.RemoteTransportException: [xg-ops-elk-javaes-mgt-3][10.0.23.55:9300][internal:discovery/zen/join/validate]
Caused by: java.lang.NullPointerException

dadoonet · January 14, 2018, 9:57am

What are your elasticsearch.yml settings?

Arter_Xu · January 14, 2018, 10:02am

master node:

cluster.name: xg-ops-elk-javaes-cluster
node.name: xg-ops-elk-javaes-mgt-2
node.master: true
node.data: false
path.data: /data/es_data/
path.logs: /data/es_log/
path.conf: /opt/elasticsearch/config
bootstrap.memory_lock: true
network.host: ["10.0.19.68","127.0.0.1"]
transport.tcp.compress: true
transport.tcp.port: 9300
http.port: 9200
http.max_content_length: 100mb
discovery.zen.ping.unicast.hosts: [xg-ops-elk-javaes-mgt-1,xg-ops-elk-javaes-mgt-2,xg-ops-elk-javaes-mgt-3,'xg-ops-elk-javaes-1:9301', 'xg-ops-elk-javaes-2:9301', 'xg-ops-elk-javaes-3:9301', 'xg-ops-elk-javaes-4:9301', 'xg-ops-elk-javaes-5:9301', 'xg-ops-elk-javaes-6:9301', 'xg-ops-elk-javaes-7:9301', 'xg-ops-elk-javaes-8:9301', 'xg-ops-elk-javaes-9:9301', 'xg-ops-elk-javaes-10:9301', 'xg-ops-elk-javaes-11:9301', 'xg-ops-elk-javaes-12:9301', 'xg-ops-elk-javaes-13:9301', 'xg-ops-elk-javaes-14:9301', 'xg-ops-elk-javaes-15:9301', 'xg-ops-elk-javaes-16:9301', 'xg-ops-elk-javaes-17:9301', 'xg-ops-elk-javaes-18:9301', 'xg-ops-elk-javaes-19:9301', 'xg-ops-elk-javaes-20:9301']
#discovery.zen.ping.unicast.hosts: [xg-ops-elk-javaes-mgt-1,xg-ops-elk-javaes-mgt-2,xg-ops-elk-javaes-mgt-3]
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping_timeout: 60s
cluster.routing.allocation.same_shard.host: true

http.cors.enabled: true
http.cors.allow-origin: "*"
http.cors.allow-methods: OPTIONS, HEAD, GET, POST
http.cors.allow-headers: X-Requested-With, Content-Type, Content-Length, X-Auth-Token

upgrade node:

cluster.name: xg-ops-elk-javaes-cluster
node.name: xg-ops-elk-javaes-mgt-3
node.master: true
node.data: false
path.data: /data/es_data/
path.logs: /data/es_log/
path.conf: /opt/elasticsearch/config
bootstrap.memory_lock: true
network.host: ["10.0.23.55","127.0.0.1"]
transport.tcp.compress: true
transport.tcp.port: 9300
http.port: 9200
http.max_content_length: 100mb
discovery.zen.ping.unicast.hosts: [xg-ops-elk-javaes-mgt-1,xg-ops-elk-javaes-mgt-2,xg-ops-elk-javaes-mgt-3]
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping_timeout: 10s
cluster.routing.allocation.same_shard.host: true
indices.fielddata.cache.size: 30%

http.cors.enabled: true
http.cors.allow-origin: "*"
http.cors.allow-methods: OPTIONS, HEAD, GET, POST
http.cors.allow-headers: X-Requested-With, Content-Type, Content-Length, X-Auth-Token

Arter_Xu · January 14, 2018, 10:10am

The other clusters, which have the same config, upgraded successfully.
This cluster is strange.

dadoonet · January 14, 2018, 11:14am

I'd put

discovery.zen.ping.unicast.hosts: [xg-ops-elk-javaes-mgt-1,xg-ops-elk-javaes-mgt-2,xg-ops-elk-javaes-mgt-3]

On all nodes, not only data nodes.

But that is probably not the issue.

I think that it's related to some index settings like index.routing.allocation.exclude.XXX. Are you using that in your index settings? If so, which one?
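To check for this kind of setting, a minimal Python sketch like the one below could scan the index settings for allocation filters with a null or empty value. It assumes the input is the parsed JSON of `GET /_all/_settings` in its default nested form; the function name and the sample index name are made up for illustration.

```python
from typing import Any, Dict, Iterator, Tuple

# Filter kinds that can appear under index.routing.allocation.
FILTER_KINDS = ("include", "exclude", "require")


def _dig(node: Any, *keys: str) -> Dict[str, Any]:
    """Follow nested keys; return {} if any level is missing or not an object."""
    for key in keys:
        if not isinstance(node, dict):
            return {}
        node = node.get(key)
    return node if isinstance(node, dict) else {}


def find_empty_allocation_filters(
    all_settings: Dict[str, Any],
) -> Iterator[Tuple[str, str, str]]:
    """Yield (index, kind, attribute) for every filter whose value is null or empty.

    Assumed input shape: {index_name: {"settings": {"index": {...}}}}.
    """
    for index_name, body in all_settings.items():
        allocation = _dig(body, "settings", "index", "routing", "allocation")
        for kind in FILTER_KINDS:
            for attribute, value in _dig(allocation, kind).items():
                if value is None or value == "":
                    yield index_name, kind, attribute


if __name__ == "__main__":
    # Hypothetical sample; in practice this would be the parsed response body.
    sample = {
        "some-index": {
            "settings": {
                "index": {"routing": {"allocation": {"exclude": {"disk": None}}}}
            }
        }
    }
    for hit in find_empty_allocation_filters(sample):
        print(hit)
```

Any index it reports is a candidate to inspect before upgrading the next node.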

Arter_Xu · January 14, 2018, 12:10pm

Yes, I am using index.routing.allocation.exclude.XXX in some index settings. Maybe that is the issue; I will check it and try to solve it.
If you know a way to solve the issue, please tell me.
By the way, is it a bug?
Thank you very much! Good luck.

Arter_Xu · January 14, 2018, 12:46pm

Maybe the problem is related to the template settings.
Not all nodes have these settings:

node.attr.zone: zone_one
node.attr.disk: ssd
Or
node.attr.zone: zone_one
node.attr.disk: sata

so I chose include and exclude settings for the indices.
The templates:

default:
{
  "order": 0,
  "template": "*",
  "settings": {
    "index": {
      "routing": {
        "allocation": {
          "exclude": {
            "disk": "sata"
          }
        }
      }...
Other template:
{
  "order": 1,
  "template": "apm.store_tranport-*",
  "settings": {
    "index": {
      "routing": {
        "allocation": {
          "include": {
            "disk": "sata"
          },
          "exclude": {
            "disk": null
          }
        }
      }
...

I think I can add the node.attr.disk: ssd setting to the nodes that need it, restart those nodes, and change exclude to include, which may solve the issue.
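To illustrate how the two templates above could combine, here is a simplified Python model of order-based template merging. It is only a sketch under the assumption that a higher `order` overrides a lower one key by key and that a null value replaces the lower value without removing the key; it is not Elasticsearch's actual merge code, and the function names are made up.

```python
from typing import Any, Dict, List


def deep_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `base` with `override` applied; nested objects merge."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            # A null (None) replaces the old value, but the key itself stays.
            merged[key] = value
    return merged


def apply_templates(templates: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge template settings in ascending `order`, so higher orders win."""
    settings: Dict[str, Any] = {}
    for template in sorted(templates, key=lambda t: t["order"]):
        settings = deep_merge(settings, template["settings"])
    return settings


default_template = {
    "order": 0,
    "template": "*",
    "settings": {"index": {"routing": {"allocation": {"exclude": {"disk": "sata"}}}}},
}
apm_template = {
    "order": 1,
    "template": "apm.store_tranport-*",
    "settings": {
        "index": {
            "routing": {
                "allocation": {"include": {"disk": "sata"}, "exclude": {"disk": None}}
            }
        }
    },
}

if __name__ == "__main__":
    print(apply_templates([default_template, apm_template]))
```

Under this model an index matching both templates ends up with include.disk set to "sata" and exclude.disk still present with a null value.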

dadoonet · January 14, 2018, 1:11pm

That'd be awesome to add the settings everywhere and confirm.

Yes to me it's a bug. But let's continue digging before opening the issue.

Thanks for the tests so far!

Arter_Xu · January 14, 2018, 2:29pm

Dear dadoonet,
Thanks for helping me.
With your support I have found the root cause.
My description is below:
Through my tests I proved that the templates are the root cause. With the default and apm.store_tranport-* templates above, the created index gets these settings:

 "include": {
            "disk": "sata"
          },
          "exclude": {
            "disk": null
          }

rather than

 "include": {
            "disk": "sata"
          }

The "disk": null should not appear in the index settings.

For now I handle it temporarily with a shell command:

curl -XPUT 127.0.0.1:9200/index/_settings -H 'Content-Type: application/json' -d '{
    "settings": {
        "index": {
            "routing": {
                "allocation": {
                    "exclude": {
                        "disk": null
                    }
                }
            }
        }
    }
}'

Then "disk": null disappears from the index settings, and the upgraded node can join the cluster.
In short, the upgraded node cannot join the cluster when an include.xxx or exclude.xxx setting is null; when the value is not null, everything works normally.
I will clean up my template settings.
I hope you can fix this issue in a later release.
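For several affected indices, the same workaround could be scripted. The sketch below only builds the requests and does not send them. It assumes the update goes to the index `_settings` endpoint with PUT and mirrors the body from the curl command above; the function name, base URL and index name are placeholders.

```python
import json
from urllib.request import Request


def build_reset_request(base_url: str, index: str, kind: str, attribute: str) -> Request:
    """Build a PUT <index>/_settings request that sets one allocation filter to null.

    `kind` is "include" or "exclude"; `attribute` is the node attribute, e.g. "disk".
    """
    body = {
        "settings": {
            "index": {"routing": {"allocation": {kind: {attribute: None}}}}
        }
    }
    return Request(
        url=f"{base_url}/{index}/_settings",
        data=json.dumps(body).encode("utf-8"),  # None is serialized as JSON null
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


if __name__ == "__main__":
    # Placeholder host and index name; nothing is sent here.
    request = build_reset_request("http://127.0.0.1:9200", "my-index", "exclude", "disk")
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Sending a request would be a separate step with `urllib.request.urlopen(request)`, best tried on a single index first.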

Thanks again for your help!

dadoonet · January 14, 2018, 4:03pm

I opened this:

system · February 11, 2018, 4:03pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
