Best Practice for Cluster Restarts

davrob · June 3, 2011, 10:17am

Hi,

When I take my server cluster down, I have noticed that the
client-only (no data) nodes sometimes cannot find the new master when
it starts up (using unicast discovery). Should I always take down the
client nodes if I have to bring the whole cluster down?

Best Regards,

David.

dbenson · June 3, 2011, 3:57pm

We use:
curl -XPOST http://localhost:9200/_cluster/nodes/_shutdown

This shuts down ES and all Java client apps (using the ES node)
connected to the cluster.

David

kimchy · June 3, 2011, 10:21pm

There was a bug in the shutdown API that took the client nodes down as well; that no longer happens. Once the cluster is back up, the client nodes should connect to it. Which are you using, the TransportClient or the NodeClient?

davrob · June 4, 2011, 3:11pm

Hi Shay,

I'm using the NodeClient like this:

	Node hostNode = NodeBuilder.nodeBuilder()
		.loadConfigSettings(false)
		.settings(settings)
		.node()
		.start();

	Client methodLocalClient = hostNode.client();

kimchy · June 4, 2011, 4:35pm

If it's not a client node (NodeBuilder#client(true)), then it will be restarted, yes.

Karussell1 · June 4, 2011, 9:51pm

> There was a bug in the shutdown API where the client nodes would go down as well. They aren't anymore.

Cool.

Is it now possible to start the client (TransportClient) first and
then start the ES node? (Sometimes the client is started only 1-2
seconds earlier, which causes a bit of trouble ...)

kimchy · June 4, 2011, 9:59pm

It should not be a problem. Do you see a problem now? It should reconnect to the server / cluster once it's back up.
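
For anyone who would still rather hold the client back until the node is accepting connections, here is a minimal sketch in plain Java. It only checks that something accepts TCP connections on the given port; it says nothing about cluster state or health. The class name TransportPortWaiter, the method waitForPort, and the timeout values are assumptions made up for illustration and are not part of Elasticsearch; 9300 is the default transport port.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper, not part of Elasticsearch: waits until a TCP port accepts connections. */
final class TransportPortWaiter {

    /**
     * Polls host:port until a TCP connection succeeds or timeoutMillis elapses.
     * Returns true if a connection was established, false on timeout.
     */
    static boolean waitForPort(String host, int port, long timeoutMillis, long pollMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            try (Socket socket = new Socket()) {
                // Cap each connect attempt at one second so a single attempt cannot hang.
                socket.connect(new InetSocketAddress(host, port), 1000);
                return true;
            } catch (IOException e) {
                // Nothing is listening yet (or the host is unreachable); retry below.
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            Thread.sleep(pollMillis);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Assumed values: localhost and the default transport port 9300.
        boolean up = waitForPort("localhost", 9300, 5000, 250);
        System.out.println(up ? "node is accepting connections" : "timed out waiting for node");
    }
}
```

An application could call waitForPort before creating its TransportClient and fall back to starting the client anyway on timeout, since the client is expected to reconnect on its own once the cluster is back.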

Karussell1 · June 4, 2011, 10:12pm

Ah, cool, it is already possible ... yes, it works. Thanks a lot,
Shay!

BTW: while the node is starting (the client is already running),
the client reports an NPE several times:

2011-06-05 00:08:51,960 [elasticsearch[cached]-pool-78-thread-1] WARN org.elasticsearch.client.transport - [Bloodtide] failed to get node info for [#transport#-1][inet[/127.0.0.1:9300]]
org.elasticsearch.transport.RemoteTransportException: [Impala][inet[/127.0.0.1:9300]][/cluster/nodes/info]
Caused by: java.lang.NullPointerException
	at org.elasticsearch.action.support.nodes.TransportNodesOperationAction$AsyncAction.start(TransportNodesOperationAction.java:145)
	at org.elasticsearch.action.support.nodes.TransportNodesOperationAction$AsyncAction.access$300(TransportNodesOperationAction.java:103)
	at org.elasticsearch.action.support.nodes.TransportNodesOperationAction.doExecute(TransportNodesOperationAction.java:72)
	at org.elasticsearch.action.support.nodes.TransportNodesOperationAction.doExecute(TransportNodesOperationAction.java:44)
	at org.elasticsearch.action.support.BaseAction.execute(BaseAction.java:61)
	at org.elasticsearch.action.support.nodes.TransportNodesOperationAction$TransportHandler.messageReceived(TransportNodesOperationAction.java:226)
	at org.elasticsearch.action.support.nodes.TransportNodesOperationAction$TransportHandler.messageReceived(TransportNodesOperationAction.java:218)
	at org.elasticsearch.transport.netty.MessageChannelHandler.handleRequest(MessageChannelHandler.java:183)
	at org.elasticsearch.transport.netty.MessageChannelHandler.messageReceived(MessageChannelHandler.java:85)
	at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:80)
	at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:545)
	at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:754)
	at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:302)
	at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:317)
	at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:299)
	at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:216)
	at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:80)
	at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:545)
	at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:754)
	at org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:51)
	at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:545)
	at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:540)
	at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:274)
	at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:261)
	at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:349)
	at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.processSelectedKeys(NioWorker.java:280)
	at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:200)
	at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
	at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:44)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)