Hey guys,
We've been running a 5-node cluster for our index (5 shards, 1 replica,
evenly distributed across the 5 nodes), and we're running into a problem with
one of the nodes in the cluster. It isn't tied to any specific node and can
happen sporadically on any of them.
One of the machines starts spiking to close to 100% CPU usage and an OS load
of close to 8 (which is amusing, considering the machine has only 4 CPU
cores), while all the other machines operate normally, well below those
figures. Naturally, this behavior is accompanied by extremely high write and
read times as well.
Here's what Marvel looks like: [Marvel screenshot attached]
Here's all the information we could gather:
- Full thread dump from while this occurred: https://gist.github.com/danielschonfeld/ff6c3744197f2c748632
- GET _nodes/stats: https://gist.github.com/schonfeld/693c8dbf0dd57e4cff7c
- GET _nodes/hot_threads: https://gist.github.com/schonfeld/766d771d211e452a7100
- GET _cluster/stats: https://gist.github.com/schonfeld/d5395f97e3a87745cc1f
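
For reference, here's roughly how we pulled those stats. This is just a sketch, and it assumes the HTTP API is reachable on localhost:9200 (the base URL and output filenames are placeholders; adjust to your setup):

# Sketch: fetch the same Elasticsearch diagnostics listed above and save
# each response to a local file. Assumes the node answers on localhost:9200.
import urllib.request

BASE = "http://localhost:9200"  # assumption: default HTTP port on the local node

ENDPOINTS = {
    "nodes_stats": "/_nodes/stats",
    "nodes_hot_threads": "/_nodes/hot_threads",
    "cluster_stats": "/_cluster/stats",
}

for name, path in ENDPOINTS.items():
    with urllib.request.urlopen(BASE + path) as resp:
        body = resp.read().decode("utf-8")
    # _nodes/hot_threads returns plain text; the other two return JSON
    with open(name + ".txt", "w") as out:
        out.write(body)
    print("saved", name)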
Thoughts? Insights? Any clues would be greatly appreciated.