So we have an Elasticsearch cluster up and running: 5 nodes (1 master, 1
client and 3 data). At some point the Marvel dash began reporting for all our
nodes and indices that "no report has been received for more than 2m." I'm
not sure when this started, but the cluster status is green,
the shard and index count up top is correct, and the dash is showing
correct document count, index rate and request rate on the indices.
I've Googled around a bunch and haven't been able to find anyone else
having this issue. A rolling restart is something I'd considered, but since
the cluster is powering search on our site I'd like it to be a last resort.
Digging into this a little, it seems like it's an issue specifically w/ the
Master node. If I load up Marvel on any of the other nodes, they see
everything but the Master (and the updated index data it would hold).
Cluster still reports green w/ all 5 nodes for some reason, though.
It's only when I look at Marvel on the Master node that it looks like
everything is unresponsive (yet still showing a green status and 5 nodes).
I'll keep updating in case I figure out what's going on so other folks will
have an answer if they come across this.
Cheers,
- Tristan -
I finally got a spare few minutes to see what was up, and there was a clock
offset of 200+ seconds on the master. Not sure why our EC2 instance was not
syncing time to correct that offset, as it's an AWS-built Ubuntu AMI.
I corrected the offset on the master node and voilà! Things are back to
normal now.
So yeah... if anyone else is experiencing weird misreporting for the master
or any other node, check the clock offset. It's an easy thing to look at.
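For anyone who wants to script that comparison, here is a minimal Python sketch. It assumes you have already collected each node's clock reading (for example by running `date +%s` on every host at roughly the same moment) along with a reading from a clock you trust. The function name, the node names and the sample numbers are all made up for illustration; nothing here is a Marvel or Elasticsearch API. The 120-second limit mirrors the "more than 2m" in the dashboard warning. An offset of 200 s is past that limit, which would be consistent with what we saw, though that link is my assumption and not something confirmed in this thread.

```python
# Hypothetical helper: none of these names come from Marvel or Elasticsearch.
MARVEL_STALE_SECONDS = 120  # assumed limit, taken from the "more than 2m" warning


def find_skewed_nodes(node_clocks, reference_time, limit=MARVEL_STALE_SECONDS):
    """Return {node: offset_seconds} for nodes whose clock is off by more than limit.

    node_clocks maps a node name to the epoch seconds that node reported;
    reference_time is a trusted clock reading taken at the same moment.
    """
    skewed = {}
    for name, reported in node_clocks.items():
        offset = reported - reference_time  # negative means the node is behind
        if abs(offset) > limit:
            skewed[name] = offset
    return skewed


# Made-up readings: the master is 200 s behind, the other nodes are close.
reference = 1430427000
clocks = {
    "master": reference - 200,
    "client": reference + 1,
    "data-1": reference,
    "data-2": reference - 2,
    "data-3": reference + 3,
}
print(find_skewed_nodes(clocks, reference))  # prints {'master': -200}
```

Whether the real offset was ahead or behind isn't stated above, so the sign in the sample data is a guess. The check uses the absolute value, so it flags either direction.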
Mark, again, thanks.
Cheers,
- Tristan -
On Thursday, April 30, 2015 at 4:10:31 PM UTC-5, Mark Walkom wrote:
Check that time is synced between your nodes.
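One way to measure a single host's offset without extra tooling is a bare-bones SNTP query. The Python sketch below is only that, a sketch: it sends one client request, reads the server's transmit timestamp and subtracts the local clock, ignoring network delay, so it is good enough to spot an offset of minutes but not milliseconds. The server name `pool.ntp.org` is just an example default, and the function names are invented for illustration.

```python
import socket
import struct
import time

NTP_EPOCH_OFFSET = 2208988800  # seconds between 1900-01-01 and 1970-01-01


def parse_transmit_time(packet):
    """Extract the server transmit timestamp (Unix seconds) from an NTP reply."""
    if len(packet) < 48:
        raise ValueError("NTP reply must be at least 48 bytes")
    # Bytes 40..47 hold the transmit timestamp: 32-bit seconds, 32-bit fraction.
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return seconds - NTP_EPOCH_OFFSET + fraction / 2**32


def clock_offset(server="pool.ntp.org", timeout=5.0):
    """Rough offset in seconds: server time minus local time (no delay correction)."""
    request = b"\x1b" + 47 * b"\0"  # LI=0, version 3, mode 3 (client)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(request, (server, 123))
        reply, _ = sock.recvfrom(512)
    return parse_transmit_time(reply) - time.time()
```

Calling `clock_offset()` on each node and comparing the results should make a badly drifting host stand out. It needs outbound UDP access to port 123, which a security group may block.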