Nodes randomly not getting latest cluster state

We have run into a weird problem in our cluster deployment: at random, nodes stop receiving the latest cluster state. When this happens, they just sit there without doing anything in the cluster, which breaks any indexing or searching that touches the shards on that node. The cluster (the other nodes and the master) still reports a healthy state. The only mitigation we have found to get things moving on the node again is to restart the Elasticsearch service on it.
So far we have found no correlation between any activity and this problem. Any hints would be helpful.
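
One way to narrow down which nodes are behind is to ask each node for its own copy of the cluster state version and compare the answers. The sketch below is only an illustration, not something taken from the cluster in question: it assumes the cluster state API can be queried as `/_cluster/state/version?local=true` on the default HTTP port 9200 and returns a top-level `version` field, and the host names are hypothetical. Versions can differ briefly while a new state is being published, so a node should only be treated as stuck if it stays behind across repeated checks.

```python
import json
from urllib.request import urlopen


def fetch_local_state_version(host, port=9200, timeout=5):
    """Return the cluster state version as seen by a single node.

    Assumption: local=true makes the node answer from its own copy of
    the state instead of forwarding the request to the master.
    """
    url = "http://%s:%d/_cluster/state/version?local=true" % (host, port)
    with urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["version"]


def find_stale_nodes(versions, tolerance=0):
    """Given {host: version}, return the hosts lagging the newest version.

    A node counts as stale when it is more than `tolerance` versions
    behind the highest version reported by any node.
    """
    if not versions:
        return []
    newest = max(versions.values())
    return sorted(h for h, v in versions.items() if newest - v > tolerance)


if __name__ == "__main__":
    # Hypothetical host names; replace with the real node addresses.
    hosts = ["data-node-01", "data-node-02", "master-01"]
    seen = {}
    for host in hosts:
        try:
            seen[host] = fetch_local_state_version(host)
        except OSError as exc:  # connection refused, timeout, DNS failure
            print("could not reach %s: %s" % (host, exc))
    print("versions:", seen)
    print("stale nodes:", find_stale_nodes(seen))
```

Running this on a schedule and logging the result alongside the Elasticsearch logs may help show whether a node falls behind right after a particular event, such as a mapping update.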

Checked your logs?

What ES and Java versions are you on, how many nodes (data and master) do you have,
and how much data do you store (doc count, size, and index count)?

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com

On 25 September 2014 00:09, srijan55 srijan55@gmail.com wrote:

--
View this message in context:
http://elasticsearch-users.115913.n3.nabble.com/Nodes-randomly-not-getting-latest-cluster-state-tp4063966.html
Sent from the ElasticSearch Users mailing list archive at Nabble.com.

--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/1411567793551-4063966.post%40n3.nabble.com
.
For more options, visit https://groups.google.com/d/optout.

Most of the time, the logs around the problem relate to attempts to index a document that doesn't conform to the mappings (we expect this to happen occasionally). We are running a big cluster with 50 data nodes and 3 masters. Data is about 1 TB. We create 5 indexes per day, with approximately 100k documents in the biggest ones.

Forgot to mention: ES 1.3.2, Java 7.