Multiple masters elected during cluster crash - question about data consistency

Marek_Skorek · November 28, 2012, 8:07am

Hi,

I hit an OOME (OutOfMemoryError) on all three nodes in my cluster.

After I restarted the nodes from the system shell (concurrently), node 1
came up first and was elected master. Right after that, nodes 2 and 3
started too, but they could not see that node 1 was already running, so
node 2 was elected master as well.
My single three-node cluster was thus split into two instances (one with a
single node and the other with the remaining two nodes). They started to
work independently, and a river was started on each "cluster", so two
rivers were running concurrently.

The question is:

If two rivers are working concurrently, is there any chance that, after
fixing the situation and "merging" the broken clusters back into a single
one, all the data will be available/indexed? Or does my river need to take
care of data consistency?

Thanks for any advice.

Regards,
Marek.

--

Igor_Motov · November 29, 2012, 1:50am

It's possible that after the "merge" you will end up with two copies of the
same shard that are supposed to hold the same data but contain different
sets of documents, because they were parts of different clusters. It's a
somewhat difficult situation to recover from. The best thing you can do in
this case is to remove one of the copies (by temporarily setting the number
of replicas to 0, for example) and then reindex the missing records.
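
To make the "reindex the missing records" step concrete, here is a minimal
Python sketch. It assumes you can list the document IDs that should exist
(from the river's source system) and the IDs that the surviving shard copy
actually contains; how you obtain both lists is up to you. The function name
missing_ids is hypothetical and only for illustration, it is not part of
Elasticsearch.

```python
from typing import Iterable, List


def missing_ids(source_ids: Iterable[str], indexed_ids: Iterable[str]) -> List[str]:
    """Return the IDs present in the source system but absent from the index.

    Hypothetical helper: both inputs are plain collections of document IDs
    that the caller has already fetched; nothing here talks to a cluster.
    """
    # A set difference yields exactly the records that need to be reindexed.
    return sorted(set(source_ids) - set(indexed_ids))


# Example: the surviving copy only saw documents "a" and "c" during the split.
to_reindex = missing_ids(["a", "b", "c", "d"], ["a", "c"])
print(to_reindex)
```

Only the IDs returned here need to be pushed through the river (or indexed
by hand) again; documents already present in the surviving copy are left
alone.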

If you haven't done this already, I would recommend setting
discovery.zen.minimum_master_nodes
(http://www.elasticsearch.org/guide/reference/modules/discovery/zen.html)
to 2 (more than half of the master-eligible nodes in your cluster) in order
to prevent this situation from happening in the future.
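
The value 2 follows from the "more than half" rule. As a small Python
sketch of that arithmetic (the function names are hypothetical and not part
of Elasticsearch; they only illustrate why a lone node can no longer elect
itself master in a three-node cluster):

```python
def minimum_master_nodes(master_eligible: int) -> int:
    """Smallest strict majority of the master-eligible nodes."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


def can_elect_master(visible_nodes: int, master_eligible: int) -> bool:
    """True if a group of nodes that can see each other forms a majority."""
    return visible_nodes >= minimum_master_nodes(master_eligible)


# Three master-eligible nodes, as in the cluster described above.
quorum = minimum_master_nodes(3)
alone = can_elect_master(1, 3)   # node 1 on its own
pair = can_elect_master(2, 3)    # nodes 2 and 3 together
print(quorum, alone, pair)
```

With this setting, the single node that started first would have waited for
a second node instead of becoming master by itself, so only one of the two
groups could have formed a cluster.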

--

Marek_Skorek · December 4, 2012, 3:21pm

Thank you, Igor. That is the answer I expected.

--
