Shards unassigned after service restart when disconnected from the network

Simon_Orr · January 24, 2013, 4:30pm

We have three nodes working together in a cluster. There are 2 indices with
10 shards each and 2 replicas, meaning each node holds a full copy of the data.
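
As a quick check on that layout, the shard counts work out as sketched below. This is only an illustration in Python; the function name is made up for the example and it assumes the copies are spread evenly across the nodes.

```python
def shard_copies(indices, primaries_per_index, replicas, nodes):
    """Return (total_copies, copies_per_node) for an evenly balanced cluster."""
    primaries = indices * primaries_per_index
    total = primaries * (1 + replicas)  # each primary plus its replicas
    return total, total // nodes


total, per_node = shard_copies(indices=2, primaries_per_index=10, replicas=2, nodes=3)
print(total, per_node)  # 60 20
```

With three copies of each shard on three nodes, and two copies of the same shard never placed on the same node, every node ends up holding one copy of all 20 shards.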

Two of the nodes are servers, the third is an overpowered laptop. The
reason for this setup is so that the laptop can be taken off-site to
demonstrate a product and, theoretically, update with the latest data when
returned. The laptop index will only be used in a read-only way when not
connected to the other servers.

If the laptop is disconnected from the network, the index carries on
working just fine until the Elasticsearch service is restarted. After a
restart, all the shards and replicas are "Unassigned". If reconnected to
the network and the service is restarted again, it recovers and starts
working again.
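
One way to tell the two states apart is the cluster health API, which reports `active_shards` and `unassigned_shards`. The sketch below is a minimal example, assuming the node answers HTTP on `localhost:9200` (the default port; adjust if configured otherwise). The two sample dictionaries are hypothetical responses trimmed to the fields used, not output captured from this cluster. The first one assumes that a lone node which recovered correctly would hold the 20 primaries and leave the 40 replicas unassigned.

```python
import json
from urllib.request import urlopen


def fetch_health(base_url="http://localhost:9200"):
    """Fetch the cluster health document over the HTTP API."""
    with urlopen(base_url + "/_cluster/health", timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))


def describe(health):
    """Summarize a health document as a short, human-readable verdict."""
    active = health["active_shards"]
    unassigned = health["unassigned_shards"]
    if unassigned == 0:
        return "all shard copies assigned"
    if active == 0:
        return "no shard copies assigned (the state described above)"
    return "%d assigned, %d unassigned" % (active, unassigned)


# Hypothetical responses, trimmed to the fields used here.
isolated_but_healthy = {"status": "yellow", "active_shards": 20, "unassigned_shards": 40}
as_reported = {"status": "red", "active_shards": 0, "unassigned_shards": 60}

print(describe(isolated_but_healthy))  # 20 assigned, 40 unassigned
print(describe(as_reported))           # no shard copies assigned (the state described above)

# Against a live node: print(describe(fetch_health()))
```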

discovery.zen.minimum_master_nodes is set to 1

The logs include: recovered [2] indices into cluster_state

We initially used zone and rack awareness (with 1 replica) to keep a
complete copy on the laptop and half on each of the servers. However, the
load was small enough that we increased replicas to 2. The problem
occurs whether rack/zone awareness is turned on or off.

The node/rack awareness setup we use:

node.rack: rack1
node.zone: zone1
cluster.routing.allocation.awareness.force.zone.values: zone1,zone2
cluster.routing.allocation.awareness.attributes: rack, zone

Elasticsearch version: 0.20.2

How can we get the single node on the laptop to continue running after a
restart when not connected to the other nodes?

--

radu_gheorghe · January 29, 2013, 8:31am

Hello Simon,

I know it's kind of a long shot, but maybe what you're experiencing has to
do with the laptop having no network during restart. You can try and fix
that by adding/setting "network.host: 127.0.0.1" in your config. Then
restart Elasticsearch.

And when you connect the laptop back to the "home cluster", I guess it will
be a matter of reverting the config changes and restarting ES again.

Best regards,
Radu

http://sematext.com/ -- Elasticsearch -- Solr -- Lucene

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Simon_Orr · January 29, 2013, 9:30am

Hi Radu,

Thanks for the suggestion but I don't think that's it. I should've
specified in my first post but the ES instance on the laptop is running
inside a VM so even when the laptop is disconnected from the network, it
has the same IP address, as provided by the VM Host's virtual adapter.

Thanks for taking the time to reply.

radu_gheorghe · January 29, 2013, 1:08pm

Hi Simon,

Could you double-check? Because I'd suspect it depends on your VM's network
configuration. Unless you have NAT with port forwarding, I don't see a
situation where disconnecting the "host" network will keep the ES default
configuration happy.

Anyway, if I'm not on the right track, I'd suggest you try restarting ES
with debug logging activated and see if you get any clues from there.

Best regards,
Radu

http://sematext.com/ -- Elasticsearch -- Solr -- Lucene

Simon_Orr · January 29, 2013, 3:12pm

Hi Radu,

The VM is set up with bridged networking, and the simulated "cable" is always
plugged in. The IP addresses are all statically assigned, e.g.:

Host: 10.0.0.100 (Virtual NIC), 10.0.0.50 (Physical NIC)
VM: 10.0.0.200 (Gateway set to: 10.0.0.100)
Other ES Servers: 10.0.0.201, 10.0.0.202

When the Host is disconnected, the VM is still 10.0.0.200 and can still see
the host on 10.0.0.100 but nothing beyond that. As far as the Linux VM is
concerned, the network just got a little smaller.
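
A simple way to confirm that view from inside the VM is to try a TCP connection to each peer. The sketch below is only an illustration: the helper names are made up, and it assumes the other nodes listen on the default transport port 9300, which may not match the actual configuration.

```python
import socket


def can_connect(host, port=9300, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable networks and timeouts.
        return False


def report(hosts, port=9300):
    """Map each host to whether its transport port is reachable."""
    return {host: can_connect(host, port) for host in hosts}


# Example, using the addresses of the other ES servers listed above:
# print(report(["10.0.0.201", "10.0.0.202"]))
```

When the host is off the network, both peers would be expected to come back as unreachable while the VM keeps its own address.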

That said, I will give it another go. Meanwhile, can you point me at the
documentation on how to enable Debug mode?

Thanks again.

radu_gheorghe · January 29, 2013, 9:07pm

Hello Simon,

OK, I see now.

To change logging settings, take a look at logging.yml from the config
directory. That's Elasticsearch's log4j configuration - so you can refer to
the log4j documentation to get a complete understanding. But once you open
the file it's pretty straightforward - you'll probably spot the places
where INFO can be replaced with DEBUG, for example.

Best regards,
Radu

http://sematext.com/ -- Elasticsearch -- Solr -- Lucene
