Marvel showing unresponsive nodes but active data


(Tristan Hammond) #1

Hi all.

So we have an Elasticsearch cluster up and running with 5 nodes (1 master, 1
client and 3 data). At some point the Marvel dash began stating on all our
nodes and indices that "no report has been received for more than 2m." I'm
not sure when this started, but the cluster status is green, the shard and
index counts up top are correct, and the dash is showing the correct
document count, index rate and request rate on the indices.

I've Googled around a bunch and haven't been able to find someone else
having this issue. A rolling restart is something I'd considered, but since
it's powering search on our site I'd like it to be a last resort.

Any insight would be much appreciated.

Cheers,
- Tristan -

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/63760db3-2913-4f18-9442-9599187cbea3%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Tristan Hammond) #2

Hmm...

Digging into this a little, it seems like it's an issue specifically w/ the
Master node. If I load up Marvel on any of the other nodes, they see
everything but the Master (and the updated index data it would hold).
Cluster still reports green w/ all 5 nodes for some reason, though.

It's only when I look at Marvel on the Master node that it looks like
everything is unresponsive (yet still showing a green status and 5 nodes).

I'll keep updating in case I figure out what's going on so other folks will
have an answer if they come across this.

Cheers,
- Tristan -



(Mark Walkom) #3

Check that time is synced between your nodes.



(Tristan Hammond) #4

Mark, you're a lovely man.

I finally got a spare few minutes to see what was up, and there was a 200s+
offset on the master. Not sure why our EC2 instance was not syncing time to
correct that offset, as it's an AWS-built Ubuntu AMI.

I corrected the offset on the master node and voilà! Things are back to
normal now.

So yeah... if anyone else is seeing weird misreporting for the master
or any other node, check the clock offset; it's an easy thing to look at.
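
For anyone who wants to script that check, here is a minimal Python sketch. It assumes you have already collected one epoch-millisecond timestamp per node by some means (for example from each node's stats output or by running `date` over SSH; how you collect them is up to you). The function name `find_skewed_nodes`, the node names and the 120-second default are all hypothetical; the default simply mirrors the "2m" in the Marvel message and is not taken from Marvel's code.

```python
from typing import Dict


def find_skewed_nodes(
    node_times_ms: Dict[str, int],
    reference_ms: int,
    max_skew_s: float = 120.0,  # assumption: mirrors the "2m" in the Marvel warning
) -> Dict[str, float]:
    """Return {node: offset_seconds} for nodes whose clock differs from
    the reference clock by more than max_skew_s seconds.

    A negative offset means the node's clock is behind the reference.
    """
    skewed = {}
    for node, ts_ms in node_times_ms.items():
        offset_s = (ts_ms - reference_ms) / 1000.0
        if abs(offset_s) > max_skew_s:
            skewed[node] = offset_s
    return skewed


if __name__ == "__main__":
    # Hypothetical readings: the master is a bit over 200 s behind.
    reference = 1_430_000_000_000
    readings = {
        "master-1": reference - 205_000,
        "client-1": reference + 150,
        "data-1": reference - 40,
    }
    for node, offset in find_skewed_nodes(readings, reference).items():
        print(f"{node}: clock offset {offset:+.1f}s exceeds threshold")
```

This only flags the skew. Fixing it is still a matter of getting NTP (or whatever time sync you use) working on the affected host.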

Mark, again, thanks.

Cheers,
- Tristan -


--
Please update your bookmarks! We have moved to https://discuss.elastic.co/


