Cluster down after an autoreboot?

Nicholas_Thompson · February 7, 2018, 5:14pm

Hi,

We got an email at 14:18 saying the cluster had been automatically rebooted due to high memory usage.

We can now see 2 instances in the cluster (there was 1 before) and one of them (001) just keeps rebooting itself. The other (002) seems to be complaining in the logs about being unable to find a master node?

We've raised a support ticket but the SLA says up to 3 business days.

Anyone got any suggestions on what we can do in the meantime?

Nick

Nicholas_Thompson · February 7, 2018, 5:33pm

Seems to have magically come back up on its own? Unless the support team have claimed the ticket?

We still have a mystery second instance...

Is 278 unassigned shards bad?

Christian_Dahlqvist · February 7, 2018, 6:05pm

How large are the instances? How many shards do you have in total?

Nicholas_Thompson · February 7, 2018, 10:35pm

It's a single-node instance with 2 GB RAM. According to _cat/health, there are 279 shards, 278 of them unassigned?!

In terms of data, I thought we had about 4 indices (fewer than 1.5m rows and ~1 GB of data in total)... however _cat/indices shows another 260 indices for .watcher-history... dating back to May 2017 (and a few small Kibana indices). Would these cause any issues?!

I checked the cluster again this evening and we're back down to a single instance running at 60% memory pressure (although the graph indicates it frequently hits the 70% limit and does some kind of GC?)

Christian_Dahlqvist · February 8, 2018, 6:21am

If you have replicas configured for any of the indices, Elasticsearch will not be able to place these as you only have 1 node. That is nothing to worry about.
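To check that the unassigned shards really are just replicas, one option is to tally the `_cat/shards` output by shard type. The sketch below is only an illustration: the function name `count_unassigned` is made up here, and it assumes the default `_cat/shards` column order (index, shard, prirep, state, ...), where unassigned rows simply have fewer trailing columns.

```python
def count_unassigned(cat_shards_text):
    """Count UNASSIGNED shards by type from plain-text _cat/shards output.

    Assumes the default column order: index, shard, prirep, state, ...
    """
    counts = {"primary": 0, "replica": 0}
    for line in cat_shards_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # blank or malformed line
        prirep, state = fields[2], fields[3]
        if state != "UNASSIGNED":
            continue
        # "p" marks a primary shard, "r" a replica
        key = "primary" if prirep == "p" else "replica"
        counts[key] += 1
    return counts
```

If the primary count is zero, no data is missing. On a single node the replicas can never be assigned, so setting `index.number_of_replicas` to 0 on those indices should make them stop showing up as unassigned.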

Having all those .watcher-history... indices around will use up resources, so I would recommend deleting older ones, e.g. all the ones from last year or the ones older than a month. This will help with reducing heap pressure.
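A rough way to pick out the old indices is to parse the date at the end of each index name and compare it to a cutoff. This is a minimal sketch, not an official tool: the helper name `old_watcher_indices` is hypothetical, and it assumes the names look like `.watcher-history-<version>-YYYY.MM.DD`, which is worth confirming against your own `_cat/indices` output before deleting anything.

```python
import re
from datetime import date, datetime

# Assumed name format: ".watcher-history-<version>-YYYY.MM.DD"
_WATCHER_RE = re.compile(r"^\.watcher-history-\d+-(\d{4}\.\d{2}\.\d{2})$")


def old_watcher_indices(index_names, cutoff):
    """Return watcher-history index names dated strictly before `cutoff`."""
    stale = []
    for name in index_names:
        match = _WATCHER_RE.match(name)
        if match is None:
            continue  # not a watcher-history index, leave it alone
        try:
            index_day = datetime.strptime(match.group(1), "%Y.%m.%d").date()
        except ValueError:
            continue  # suffix looks like a date but is not a valid one
        if index_day < cutoff:
            stale.append(name)
    return sorted(stale)
```

For example, calling it with the names from `_cat/indices` and `date(2018, 1, 7)` as the cutoff would return only the watcher-history indices older than that day. Each returned name can then be removed with the delete index API (`DELETE /<index name>`).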

system · March 8, 2018, 6:21am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
