Rolling upgrade sequence of node types


(None) #1

Hi, I'm doing a rolling upgrade from 6.1.2 to 6.2 on a cluster that includes the node types master, ingest, coordinator and data, and I use the free license for monitoring.

I encountered 2 issues...

1- I had started with the master nodes...
Elasticsearch complained that it could not create the .watcher index. So I reverted, and had to add an extra flag to allow me to delete the watcher index...

2- After I reverted the master nodes I tried doing the data nodes first. But the data nodes complained that the master nodes were NOT 6.2 and would NOT join the cluster.

Extra note: I had skipped disabling shard allocation (see the sketch at the end of this post).

Extra question: Because it's a containerized environment (Mesos with Docker and resource constraints), I need to wipe the data node and its data volume completely so the new container can use the resources on that host. Is that ok?
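
For the record, here is roughly what I believe the skipped step looks like, based on my reading of the rolling-upgrade docs (localhost:9200 stands in for whichever node I query; the exact setting value may differ between versions):

# Before stopping a node: stop the cluster from reallocating shards while it is down
curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
{ "persistent": { "cluster.routing.allocation.enable": "none" } }'

# Optional: synced flush so shard recovery is faster when the node comes back
curl -X POST "localhost:9200/_flush/synced"

# ... stop the old container, start the upgraded one ...

# After the upgraded node has rejoined: re-enable allocation by resetting the setting
curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
{ "persistent": { "cluster.routing.allocation.enable": null } }'

# Wait for green before moving on to the next node (this is also what I would
# rely on if a data node's volume is wiped, so replicas get rebuilt first)
curl "localhost:9200/_cluster/health?wait_for_status=green&timeout=120s&pretty"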


(Mark Walkom) #2

Can you share relevant logs from these things?


(None) #3

Ok I'll get back...


(None) #4

Hi, @warkolm

Trying the upgrade again, I started with the master nodes...

Going from 6.1.2 to 6.2.2. I have the basic x-pack license.

I got the message below 5 times...

[2018-03-05T00:51:25,599][INFO ][o.e.x.m.e.l.LocalExporter] waiting for elected master node [{master}{AwR3s0L-QIOVSCPYWMuhIw}{KvG8KoaiStmxqWvX0pcOuQ}{xxxxxx.130}{xxxxxx.130:9300}{ml.machine_memory=3221225472, ml.max_open_jobs=20, ml.enabled=true}] to setup local exporter [default_local] (does it have x-pack installed?)
[2018-03-05T00:51:26,333][INFO ][o.e.x.m.e.l.LocalExporter] waiting for elected master node [{master}{AwR3s0L-QIOVSCPYWMuhIw}{KvG8KoaiStmxqWvX0pcOuQ}{xxxxxx.130}{xxxxxx.130:9300}{ml.machine_memory=3221225472, ml.max_open_jobs=20, ml.enabled=true}] to setup local exporter [default_local] (does it have x-pack installed?)
[2018-03-05T00:51:26,951][INFO ][o.e.x.m.e.l.LocalExporter] waiting for elected master node [{master}{AwR3s0L-QIOVSCPYWMuhIw}{KvG8KoaiStmxqWvX0pcOuQ}{xxxxxx.130}{xxxxxx.130:9300}{ml.machine_memory=3221225472, ml.max_open_jobs=20, ml.enabled=true}] to setup local exporter [default_local] (does it have x-pack installed?)
[2018-03-05T00:51:27,628][INFO ][o.e.x.m.e.l.LocalExporter] waiting for elected master node [{master}{AwR3s0L-QIOVSCPYWMuhIw}{KvG8KoaiStmxqWvX0pcOuQ}{xxxxxx.130}{xxxxxx.130:9300}{ml.machine_memory=3221225472, ml.max_open_jobs=20, ml.enabled=true}] to setup local exporter [default_local] (does it have x-pack installed?)
[2018-03-05T00:51:28,643][INFO ][o.e.x.m.e.l.LocalExporter] waiting for elected master node [{master}{AwR3s0L-QIOVSCPYWMuhIw}{KvG8KoaiStmxqWvX0pcOuQ}{xxxxxx.130}{xxxxxx.130:9300}{ml.machine_memory=3221225472, ml.max_open_jobs=20, ml.enabled=true}] to setup local exporter [default_local] (does it have x-pack installed?)
[2018-03-05T00:51:29,768][INFO ][o.e.x.m.e.l.LocalExporter] waiting for elected master node [{master}{AwR3s0L-QIOVSCPYWMuhIw}{KvG8KoaiStmxqWvX0pcOuQ}{xxxxxx.130}{xxxxxx.130:9300}{ml.machine_memory=3221225472, ml.max_open_jobs=20, ml.enabled=true}] to setup local exporter [default_local] (does it have x-pack installed?)
[2018-03-05T00:57:29,672][INFO ][o.e.c.s.ClusterApplierService] [master] added {{master}{xOkFTYh9Rqmh-0T73adWMg}{5_cb0oA4TViDQvPhEYpUcg}{xxxxxx.134}{xxxxxx.134:9300}{ml.machine_memory=4294967296, ml.max_open_jobs=20, ml.enabled=true},}, reason: apply cluster state (from master [master {master}{AwR3s0L-QIOVSCPYWMuhIw}{KvG8KoaiStmxqWvX0pcOuQ}{xxxxxx.130}{xxxxxx.130:9300}{ml.machine_memory=3221225472, ml.max_open_jobs=20, ml.enabled=true} committed version [26688]])
[2018-03-05T00:57:33,843][INFO ][o.e.x.m.e.l.LocalExporter] waiting for elected master node [{master}{AwR3s0L-QIOVSCPYWMuhIw}{KvG8KoaiStmxqWvX0pcOuQ}{xxxxxx.130}{xxxxxx.130:9300}{ml.machine_memory=3221225472, ml.max_open_jobs=20, ml.enabled=true}] to setup local exporter [default_local] (does it have x-pack installed?)

Is that ok?
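
In case it helps for diagnosing this, I believe these two calls show which node is currently the elected master and what x-pack reports while these messages appear (the host is a placeholder):

# Show the currently elected master node (id, host, ip, node name)
curl "localhost:9200/_cat/master?v"

# X-Pack info (license mode, enabled features) as seen by whichever node answers
curl "localhost:9200/_xpack?pretty"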


(None) #5

Every time I do an action I get that message... So for example, here I remove a master node, then the exception appears, and then the x-pack message again.

[2018-03-05T01:06:33,900][INFO ][o.e.c.s.ClusterApplierService] [master] removed {{master}{xOkFTYh9Rqmh-0T73adWMg}{5_cb0oA4TViDQvPhEYpUcg}{xxxxxx.134}{xxxxxx.134:9300}{ml.machine_memory=4294967296, ml.max_open_jobs=20, ml.enabled=true},}, reason: apply cluster state (from master [master {master}{AwR3s0L-QIOVSCPYWMuhIw}{KvG8KoaiStmxqWvX0pcOuQ}{9.0.12.130}{9.0.12.130:9300}{ml.machine_memory=3221225472, ml.max_open_jobs=20, ml.enabled=true} committed version [26689]])
[2018-03-05T01:06:33,905][WARN ][o.e.c.NodeConnectionsService] [master] failed to connect to node {master}{xOkFTYh9Rqmh-0T73adWMg}{5_cb0oA4TViDQvPhEYpUcg}{xxxxxx.134}{xxxxxx.134:9300}{ml.machine_memory=4294967296, ml.max_open_jobs=20, ml.enabled=true} (tried [1] times)
org.elasticsearch.transport.ConnectTransportException: [master][xxxxxx.134:9300] connect_exception
	at org.elasticsearch.transport.TcpChannel.awaitConnected(TcpChannel.java:165) ~[elasticsearch-6.2.2.jar:6.2.2]
	at org.elasticsearch.transport.TcpTransport.openConnection(TcpTransport.java:616) ~[elasticsearch-6.2.2.jar:6.2.2]
	at org.elasticsearch.transport.TcpTransport.connectToNode(TcpTransport.java:513) ~[elasticsearch-6.2.2.jar:6.2.2]
	at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:331) ~[elasticsearch-6.2.2.jar:6.2.2]
	at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:318) ~[elasticsearch-6.2.2.jar:6.2.2]
	at org.elasticsearch.cluster.NodeConnectionsService.validateAndConnectIfNeeded(NodeConnectionsService.java:154) [elasticsearch-6.2.2.jar:6.2.2]
	at org.elasticsearch.cluster.NodeConnectionsService$ConnectionChecker.doRun(NodeConnectionsService.java:183) [elasticsearch-6.2.2.jar:6.2.2]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:672) [elasticsearch-6.2.2.jar:6.2.2]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.2.2.jar:6.2.2]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_161]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_161]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_161]
Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: xxxxxx.134/xxxxxx.134:9300
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:?]
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[?:?]
	at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:323) ~[?:?]
	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:340) ~[?:?]
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:633) ~[?:?]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:545) ~[?:?]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:499) ~[?:?]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459) ~[?:?]
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858) ~[?:?]
	... 1 more
Caused by: java.net.ConnectException: Connection refused
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:?]
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[?:?]
	at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:323) ~[?:?]
	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:340) ~[?:?]
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:633) ~[?:?]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:545) ~[?:?]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:499) ~[?:?]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459) ~[?:?]
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858) ~[?:?]
	... 1 more
[2018-03-05T01:06:34,066][INFO ][o.e.x.m.e.l.LocalExporter] waiting for elected master node [{master}{AwR3s0L-QIOVSCPYWMuhIw}{KvG8KoaiStmxqWvX0pcOuQ}{xxxxxx.130}{xxxxxx:9300}{ml.machine_memory=3221225472, ml.max_open_jobs=20, ml.enabled=true}] to setup local exporter [default_local] (does it have x-pack installed?)

(Mark Walkom) #6

All nodes need to have x-pack installed. You can't do that with a rolling restart, unfortunately.


(None) #7

I'm pretty sure I do... I'm using the basic license. Is there a way to check if all nodes have it? (My best guess at a check is at the end of this post.)

I'm installing using Docker:

FROM docker.elastic.co/elasticsearch/elasticsearch-platinum:6.2.2

And logs from one node:

[2018-03-08T05:52:29,368][INFO ][o.e.p.PluginsService     ] [master] loaded plugin [x-pack]
[2018-03-08T05:52:48,343][INFO ][o.e.x.m.j.p.l.CppLogMessageHandler] [controller/258] [Main.cc@128] controller (64 bit): Version 6.1.2 (Build 258b02b7b1f613) Copyright (c) 2018 Elasticsearch BV
[2018-03-08T05:53:02,231][INFO ][o.e.d.DiscoveryModule    ] [master] using discovery type [zen]
[2018-03-08T05:53:06,754][INFO ][o.e.n.Node               ] [master] initialized
[2018-03-08T05:53:06,754][INFO ][o.e.n.Node               ] [master] starting ...
...
[2018-03-08T05:57:07,334][INFO ][o.e.l.LicenseService     ] [master] license [xxxxxx-xxxx-xxxx-xxxx-xxxxxx] mode [basic] - valid
[2018-03-08T05:57:09,103][INFO ][o.e.c.s.ClusterApplierService] [master] added {{data}{FHbxpH6PRoCGhx0-pt7IaQ}{WDwlSgolT66U6wHKoBiWMA}{xxxxxx.131}{xxxxxx:9300}{ml.machine_memory=51539607552, ml.max_open_jobs=20, ml.enabled=true},}, reason: apply cluster state (from master [master {master}{LPpyHNvVTkWM1BHKIVz1JQ}{kdrPcTnmRkGmsLCaQx7gdw}{xxxxxx.131}{xxxxxx.131:9300}{ml.machine_memory=3221225472, ml.max_open_jobs=20, ml.enabled=true} committed version [34824]])
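
To answer my own question above, I think something like this should list the installed plugins on every node, so each node should show the x-pack component (host is a placeholder):

# One row per node/plugin pair
curl "localhost:9200/_cat/plugins?v&h=name,component,version"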

(None) #8

I think that when you do a rolling upgrade, all the masters have to switch to 6.2.2 and the 6.1.2 ones have to be turned off??? Once I did that, the error stopped...
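
For anyone hitting the same thing, I believe this shows each node's role and version, so you can confirm no 6.1.2 masters are left (host is a placeholder):

# One row per node: name, roles (m=master-eligible, d=data, i=ingest), version,
# and which one is the elected master (*)
curl "localhost:9200/_cat/nodes?v&h=name,node.role,version,master"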


(None) #9

Yeah, that did the trick! I'm upgraded!


(system) #10

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.