We had a 5-node cluster (5 data nodes and 1 non-data master). One of
the data nodes went down. Now when we try to post anything to the
cluster it just hangs forever. There is nothing in the logs.
When I try to do a Get on an existing document, I alternately get it
immediately or just wait for a response forever.
My number of replicas is 2. My guess is that one of the replicas is
unavailable, thus when the query goes to that replica, it hangs.
Whereas next time when the query goes to the other (good) replica, it
returns immediately.
Whereas for post it needs to write to both the replicas and thus it
hangs.
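
One way to test that guess is to ask the cluster for its health summary and look at the initializing and unassigned shard counts. Below is a minimal Python sketch, assuming the cluster answers HTTP on `http://localhost:9200` (a placeholder address, not taken from this thread). The helper names `fetch_cluster_health` and `summarize_health` are made up for illustration and are not part of any client library. The request uses a timeout so the check itself cannot hang the way the posts do.

```python
import json
import urllib.request


def fetch_cluster_health(base_url="http://localhost:9200", timeout=5):
    """Fetch the _cluster/health document.

    base_url is an assumed address; the timeout keeps this call from
    blocking forever if the node does not answer.
    """
    url = base_url + "/_cluster/health"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def summarize_health(health):
    """Return a list of human-readable problems found in a health document."""
    problems = []
    if health.get("status") != "green":
        problems.append("status is %s" % health.get("status"))
    # Shards in any of these states are not fully available yet.
    for key in ("initializing_shards", "unassigned_shards", "relocating_shards"):
        count = health.get(key, 0)
        if count:
            problems.append("%s = %d" % (key, count))
    return problems
```

Calling `summarize_health(fetch_cluster_health())` against a live node returns an empty list when everything is green. A non-empty list would point at shards that are still initializing or unassigned, which would be consistent with the guess above but would not prove it.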
I looked at the logs, but there is nothing I could interpret.
Btw, both of the failing (initializing) shards were on the same node. I
shut down that node and the cluster recovered. So the good news is that the
cluster is up and running now and so far all the data seems to be
there.
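
For anyone who hits the same thing, here is a rough Python sketch of how to spot that all the stuck shards sit on one node. It assumes you have the rows from `GET /_cat/shards?format=json` already parsed into a list of dicts with `index`, `shard`, `prirep`, `state` and `node` keys. Whether your version supports `format=json` on the cat API is an assumption you should check. The function name `problem_shards_by_node` is made up for this example.

```python
from collections import defaultdict


def problem_shards_by_node(rows):
    """Group every shard that is not STARTED by the node it is assigned to.

    rows: list of dicts shaped like the _cat/shards JSON output (assumed).
    Shards without a node (e.g. unassigned ones) go under "(unassigned)".
    """
    by_node = defaultdict(list)
    for row in rows:
        if row.get("state") != "STARTED":
            node = row.get("node") or "(unassigned)"
            by_node[node].append(
                (row.get("index"), row.get("shard"), row.get("prirep"), row.get("state"))
            )
    return dict(by_node)
```

If the result has a single key holding several INITIALIZING entries, that node is the first place to look, which matches what I saw here.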