Hi List,
I am seeing pretty long times on some of my _cat queries and wanted to run them by you to see if this is expected:
$ time curl 'localhost:9200/_cat/nodes'
real    3m42.792s
user    0m0.024s
sys     0m0.000s

$ time curl 'localhost:9200/_cat/indices'
real    0m16.399s
user    0m0.004s
sys     0m0.004s
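In case it helps anyone reproduce, here is roughly how I am timing each endpoint in one go (a sketch; pending_tasks is just a guess at another endpoint worth checking while the master seems busy):

```shell
# Time each _cat endpoint separately to see which ones are slow.
# "pending_tasks" is an extra endpoint added as a guess, not one of the
# original measurements above.
for ep in nodes indices pending_tasks; do
  echo "== _cat/$ep =="
  time curl -s --max-time 300 "localhost:9200/_cat/$ep" > /dev/null
done
```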
These are run on two 'client' nodes that are not currently servicing any queries. Is this expected? About 8 hours ago I took a data node out of the cluster because it was constantly triggering messages like these on the master, and topbeat showed it had been quite overloaded for the past several days:
[2016-01-07 23:58:33,202][WARN ][transport ] [bxb-sln-vm97] Received response for a request that has timed out, sent [161674ms] ago, timed out [146674ms] ago, action [cluster:monitor/nodes/stats[n]], node [{bxb-sln-srv-4}{8nI6Rm7vT5-vlit27tAwDA}{10.86.205.57}{10.86.205.57:9300}{master=false}], id [4241130]
[2016-01-07 23:59:05,559][WARN ][transport ] [bxb-sln-vm97] Received response for a request that has timed out, sent [134029ms] ago, timed out [119029ms] ago, action [cluster:monitor/nodes/stats[n]], node [{bxb-sln-srv-4}{8nI6Rm7vT5-vlit27tAwDA}{10.86.205.57}{10.86.205.57:9300}{master=false}], id [4241904]
[2016-01-07 23:59:06,532][DEBUG][action.admin.cluster.node.stats] [bxb-sln-vm97] failed to execute on node [8nI6Rm7vT5-vlit27tAwDA]
ReceiveTimeoutTransportException[[bxb-sln-srv-4][10.86.205.57:9300][cluster:monitor/nodes/stats[n]] request_id [4243442] timed out after [15000ms]]
at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:645)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
The indices are still yellow, so perhaps shards and replicas are still being reallocated after I took that data node offline.
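If it is relocation-related, I figure something like this should show it (a sketch; the grep just filters out shards that are already STARTED):

```shell
# Check whether replicas are still being reassigned after the node removal.
echo "cluster health:"
curl -s 'localhost:9200/_cat/health?v'
echo "shards not yet STARTED:"
curl -s 'localhost:9200/_cat/shards?v' | grep -v STARTED
```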
Another fundamental question: my cluster consists of heterogeneous machines:
- Data nodes: several blades with 8G RAM and spinning disks
- Data nodes: several VMs with 16G RAM and spinning disks
- Master nodes: VMs with 8G RAM and spinning disks
- Client nodes: VMs with 32G RAM and spinning disks
ES being flexible, I don't think it should suffer from heterogeneity, but I still wanted to run this by the list to get some thoughts on anything to keep an eye on or optimize.
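One thing I was considering to deal with the mixed hardware (a sketch only; "box_type" and the tier names are made up, and this is the ES 2.x syntax for custom node attributes):

```yaml
# elasticsearch.yml on an 8G blade: tag the node with its hardware class
node.box_type: blade_8g

# Heavier indices could then be pinned to the bigger VMs via an index
# setting, e.g.:
# index.routing.allocation.require.box_type: vm_16g
```

Not sure if that is overkill for a cluster this size, so thoughts welcome.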
Thanks