Rolling upgrade sequence of node types

Hi, I'm doing a rolling upgrade from 6.1.2 to 6.2 on a cluster that includes master, ingest, coordinating, and data nodes, and I use the free license for monitoring.

I encountered 2 issues...

1 - I started with the master nodes...
Elasticsearch complained that it could not create the .watcher index. So I reverted, and I had to add an extra flag to allow me to delete the watcher index...

2 - After I reverted the master nodes, I tried doing the data nodes first. But the data nodes complained that the master nodes were NOT on 6.2 and would NOT join the cluster.

Extra note: I had skipped disabling shard allocation.
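Next time around I'll disable allocation before stopping each node. If I'm reading the 6.x rolling-upgrade docs right, it's just the cluster settings API (localhost:9200 here standing in for any node's HTTP port):

curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
{
  "persistent": {
    "cluster.routing.allocation.enable": "none"
  }
}'

and then setting "cluster.routing.allocation.enable" back to null once each upgraded node has rejoined.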

Extra question: because it's a containerized environment (Mesos with Docker and resource constraints), I need to wipe the data node and its data volume completely so the new container can use the resources on that host. Is that OK?
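Concretely, the cleanup per host would be something along these lines (the container and volume names here are just placeholders for whatever Mesos assigns):

docker stop es-data-1           # stop the old 6.1.2 container
docker rm -v es-data-1          # remove it, including its anonymous volumes
docker volume rm es-data-1-vol  # and the named data volume, if one is used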

Can you share the relevant logs from these?

Ok I'll get back...

Hi, @warkolm

Trying the upgrade again; I started with the master nodes...

Going from 6.1.2 to 6.2.2. I have the basic x-pack license.

I got five instances of the message below...

[2018-03-05T00:51:25,599][INFO ][o.e.x.m.e.l.LocalExporter] waiting for elected master node [{master}{AwR3s0L-QIOVSCPYWMuhIw}{KvG8KoaiStmxqWvX0pcOuQ}{xxxxxx.130}{xxxxxx.130:9300}{ml.machine_memory=3221225472, ml.max_open_jobs=20, ml.enabled=true}] to setup local exporter [default_local] (does it have x-pack installed?)
[2018-03-05T00:51:26,333][INFO ][o.e.x.m.e.l.LocalExporter] waiting for elected master node [{master}{AwR3s0L-QIOVSCPYWMuhIw}{KvG8KoaiStmxqWvX0pcOuQ}{xxxxxx.130}{xxxxxx.130:9300}{ml.machine_memory=3221225472, ml.max_open_jobs=20, ml.enabled=true}] to setup local exporter [default_local] (does it have x-pack installed?)
[2018-03-05T00:51:26,951][INFO ][o.e.x.m.e.l.LocalExporter] waiting for elected master node [{master}{AwR3s0L-QIOVSCPYWMuhIw}{KvG8KoaiStmxqWvX0pcOuQ}{xxxxxx.130}{xxxxxx.130:9300}{ml.machine_memory=3221225472, ml.max_open_jobs=20, ml.enabled=true}] to setup local exporter [default_local] (does it have x-pack installed?)
[2018-03-05T00:51:27,628][INFO ][o.e.x.m.e.l.LocalExporter] waiting for elected master node [{master}{AwR3s0L-QIOVSCPYWMuhIw}{KvG8KoaiStmxqWvX0pcOuQ}{xxxxxx.130}{xxxxxx.130:9300}{ml.machine_memory=3221225472, ml.max_open_jobs=20, ml.enabled=true}] to setup local exporter [default_local] (does it have x-pack installed?)
[2018-03-05T00:51:28,643][INFO ][o.e.x.m.e.l.LocalExporter] waiting for elected master node [{master}{AwR3s0L-QIOVSCPYWMuhIw}{KvG8KoaiStmxqWvX0pcOuQ}{xxxxxx.130}{xxxxxx.130:9300}{ml.machine_memory=3221225472, ml.max_open_jobs=20, ml.enabled=true}] to setup local exporter [default_local] (does it have x-pack installed?)
[2018-03-05T00:51:29,768][INFO ][o.e.x.m.e.l.LocalExporter] waiting for elected master node [{master}{AwR3s0L-QIOVSCPYWMuhIw}{KvG8KoaiStmxqWvX0pcOuQ}{xxxxxx.130}{xxxxxx.130:9300}{ml.machine_memory=3221225472, ml.max_open_jobs=20, ml.enabled=true}] to setup local exporter [default_local] (does it have x-pack installed?)
[2018-03-05T00:57:29,672][INFO ][o.e.c.s.ClusterApplierService] [master] added {{master}{xOkFTYh9Rqmh-0T73adWMg}{5_cb0oA4TViDQvPhEYpUcg}{xxxxxx.134}{xxxxxx.134:9300}{ml.machine_memory=4294967296, ml.max_open_jobs=20, ml.enabled=true},}, reason: apply cluster state (from master [master {master}{AwR3s0L-QIOVSCPYWMuhIw}{KvG8KoaiStmxqWvX0pcOuQ}{xxxxxx.130}{xxxxxx.130:9300}{ml.machine_memory=3221225472, ml.max_open_jobs=20, ml.enabled=true} committed version [26688]])
[2018-03-05T00:57:33,843][INFO ][o.e.x.m.e.l.LocalExporter] waiting for elected master node [{master}{AwR3s0L-QIOVSCPYWMuhIw}{KvG8KoaiStmxqWvX0pcOuQ}{xxxxxx.130}{xxxxxx.130:9300}{ml.machine_memory=3221225472, ml.max_open_jobs=20, ml.enabled=true}] to setup local exporter [default_local] (does it have x-pack installed?)

Is that ok?

Every time I do an action I get that message... So, for example, here I remove a master node, then get the exception below, and then the x-pack message again.

[2018-03-05T01:06:33,900][INFO ][o.e.c.s.ClusterApplierService] [master] removed {{master}{xOkFTYh9Rqmh-0T73adWMg}{5_cb0oA4TViDQvPhEYpUcg}{xxxxxx.134}{xxxxxx.134:9300}{ml.machine_memory=4294967296, ml.max_open_jobs=20, ml.enabled=true},}, reason: apply cluster state (from master [master {master}{AwR3s0L-QIOVSCPYWMuhIw}{KvG8KoaiStmxqWvX0pcOuQ}{xxxxxx.130}{xxxxxx.130:9300}{ml.machine_memory=3221225472, ml.max_open_jobs=20, ml.enabled=true} committed version [26689]])
[2018-03-05T01:06:33,905][WARN ][o.e.c.NodeConnectionsService] [master] failed to connect to node {master}{xOkFTYh9Rqmh-0T73adWMg}{5_cb0oA4TViDQvPhEYpUcg}{xxxxxx.134}{xxxxxx.134:9300}{ml.machine_memory=4294967296, ml.max_open_jobs=20, ml.enabled=true} (tried [1] times)
org.elasticsearch.transport.ConnectTransportException: [master][xxxxxx.134:9300] connect_exception
	at org.elasticsearch.transport.TcpChannel.awaitConnected(TcpChannel.java:165) ~[elasticsearch-6.2.2.jar:6.2.2]
	at org.elasticsearch.transport.TcpTransport.openConnection(TcpTransport.java:616) ~[elasticsearch-6.2.2.jar:6.2.2]
	at org.elasticsearch.transport.TcpTransport.connectToNode(TcpTransport.java:513) ~[elasticsearch-6.2.2.jar:6.2.2]
	at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:331) ~[elasticsearch-6.2.2.jar:6.2.2]
	at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:318) ~[elasticsearch-6.2.2.jar:6.2.2]
	at org.elasticsearch.cluster.NodeConnectionsService.validateAndConnectIfNeeded(NodeConnectionsService.java:154) [elasticsearch-6.2.2.jar:6.2.2]
	at org.elasticsearch.cluster.NodeConnectionsService$ConnectionChecker.doRun(NodeConnectionsService.java:183) [elasticsearch-6.2.2.jar:6.2.2]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:672) [elasticsearch-6.2.2.jar:6.2.2]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.2.2.jar:6.2.2]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_161]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_161]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_161]
Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: xxxxxx.134/xxxxxx.134:9300
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:?]
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[?:?]
	at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:323) ~[?:?]
	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:340) ~[?:?]
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:633) ~[?:?]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:545) ~[?:?]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:499) ~[?:?]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459) ~[?:?]
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858) ~[?:?]
	... 1 more
Caused by: java.net.ConnectException: Connection refused
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:?]
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[?:?]
	at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:323) ~[?:?]
	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:340) ~[?:?]
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:633) ~[?:?]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:545) ~[?:?]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:499) ~[?:?]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459) ~[?:?]
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858) ~[?:?]
	... 1 more
[2018-03-05T01:06:34,066][INFO ][o.e.x.m.e.l.LocalExporter] waiting for elected master node [{master}{AwR3s0L-QIOVSCPYWMuhIw}{KvG8KoaiStmxqWvX0pcOuQ}{xxxxxx.130}{xxxxxx:9300}{ml.machine_memory=3221225472, ml.max_open_jobs=20, ml.enabled=true}] to setup local exporter [default_local] (does it have x-pack installed?)

All nodes need to have x-pack installed. You can't do that with a rolling restart, unfortunately.

I'm pretty sure I do... I'm using the basic license. Is there a way to check if all nodes have it?
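I guess I can list what every node has loaded via the cat plugins API (again, localhost:9200 standing in for any node's HTTP port):

curl "localhost:9200/_cat/plugins?v&h=name,component,version"

Each node's row should show x-pack if it's installed there.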

I'm installing using Docker:

FROM docker.elastic.co/elasticsearch/elasticsearch-platinum:6.2.2

And logs from one node:

[2018-03-08T05:52:29,368][INFO ][o.e.p.PluginsService     ] [master] loaded plugin [x-pack]
[2018-03-08T05:52:48,343][INFO ][o.e.x.m.j.p.l.CppLogMessageHandler] [controller/258] [Main.cc@128] controller (64 bit): Version 6.1.2 (Build 258b02b7b1f613) Copyright (c) 2018 Elasticsearch BV
[2018-03-08T05:53:02,231][INFO ][o.e.d.DiscoveryModule    ] [master] using discovery type [zen]
[2018-03-08T05:53:06,754][INFO ][o.e.n.Node               ] [master] initialized
[2018-03-08T05:53:06,754][INFO ][o.e.n.Node               ] [master] starting ...
...
[2018-03-08T05:57:07,334][INFO ][o.e.l.LicenseService     ] [master] license [xxxxxx-xxxx-xxxx-xxxx-xxxxxx] mode [basic] - valid
[2018-03-08T05:57:09,103][INFO ][o.e.c.s.ClusterApplierService] [master] added {{data}{FHbxpH6PRoCGhx0-pt7IaQ}{WDwlSgolT66U6wHKoBiWMA}{xxxxxx.131}{xxxxxx:9300}{ml.machine_memory=51539607552, ml.max_open_jobs=20, ml.enabled=true},}, reason: apply cluster state (from master [master {master}{LPpyHNvVTkWM1BHKIVz1JQ}{kdrPcTnmRkGmsLCaQx7gdw}{xxxxxx.131}{xxxxxx.131:9300}{ml.machine_memory=3221225472, ml.max_open_jobs=20, ml.enabled=true} committed version [34824]])

I think that when you do a rolling upgrade, all the masters have to switch to 6.2.2 and the 6.1.2 ones have to be turned off? Once I did that, the error stopped...
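For anyone else hitting this, the per-node versions are easy to confirm from the cat nodes API (localhost:9200 being any reachable node):

curl "localhost:9200/_cat/nodes?v&h=name,version,node.role,master"

Once every master-eligible row showed 6.2.2, the message stopped for me.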

Yeah, that did the trick! I'm upgraded!

