If anyone can help me understand why my cluster is hung, I would appreciate
it.
jstack output:
I am able to query the cluster and health is good, but I can't DELETE or
CLOSE an index because the cluster is unresponsive to those requests.
mlockall is set to true
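In case it helps narrow this down, here is a minimal sketch of how the stuck state could be inspected over the REST API. It assumes the node is reachable at http://localhost:9200 (an assumption, adjust as needed) and uses the cluster health, pending tasks, and hot threads endpoints. The helper names diagnostic_urls and fetch are hypothetical, and I have not checked which of these endpoints are available on every Elasticsearch version.

```python
import urllib.request

# Assumption: the node listens on the default HTTP port on localhost.
BASE_URL = "http://localhost:9200"

# Elasticsearch diagnostic endpoints that may show why DELETE/CLOSE hangs.
ENDPOINTS = {
    "health": "/_cluster/health",
    "pending_tasks": "/_cluster/pending_tasks",
    "hot_threads": "/_nodes/hot_threads",
}


def diagnostic_urls(base_url=BASE_URL):
    """Return a dict mapping each diagnostic name to its full URL."""
    base = base_url.rstrip("/")
    return {name: base + path for name, path in ENDPOINTS.items()}


def fetch(url, timeout=10):
    """GET a URL and return the response body as text (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Calling fetch(url) for each entry of diagnostic_urls() would print the raw responses. If the pending tasks list keeps growing while health stays green, that would suggest cluster state updates are queued behind something, which might match the hung DELETE/CLOSE calls.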
iostat:
avg-cpu: %user %nice %system %iowait %steal %idle
2.00 0.05 0.30 0.08 0.00 97.57
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sdb 7.40 0.00 939.20 0 4696
sda 0.40 0.00 4.80 0 24
dm-0 0.60 0.00 4.80 0 24
dm-1 0.00 0.00 0.00 0 0
dm-2 117.40 0.00 939.20 0 4696
avg-cpu: %user %nice %system %iowait %steal %idle
2.93 0.03 0.23 0.08 0.00 96.74
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sdb 6.80 0.00 776.00 0 3880
sda 0.80 0.00 20.80 0 104
dm-0 2.60 0.00 20.80 0 104
dm-1 0.00 0.00 0.00 0 0
dm-2 97.00 0.00 776.00 0 3880
avg-cpu: %user %nice %system %iowait %steal %idle
1.20 0.03 0.25 0.10 0.00 98.42
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sdb 11.40 0.00 1312.00 0 6560
sda 0.80 0.00 22.40 0 112
dm-0 2.80 0.00 22.40 0 112
dm-1 0.00 0.00 0.00 0 0
dm-2 164.00 0.00 1312.00 0 6560
avg-cpu: %user %nice %system %iowait %steal %idle
7.07 0.03 0.50 0.08 0.00 92.33
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sdb 20.40 0.00 5064.00 0 25320
sda 1.00 0.00 25.60 0 128
dm-0 3.20 0.00 25.60 0 128
dm-1 0.00 0.00 0.00 0 0
dm-2 633.00 0.00 5064.00 0 25320
avg-cpu: %user %nice %system %iowait %steal %idle
1.23 0.05 0.33 0.10 0.00 98.30
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sdb 15.20 0.00 2604.80 0 13024
sda 2.40 0.00 38.40 0 192
dm-0 4.80 0.00 38.40 0 192
dm-1 0.00 0.00 0.00 0 0
dm-2 325.60 0.00 2604.80 0 13024
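To put the iostat numbers in more familiar units: on Linux kernels 2.4 and later, iostat reports blocks as 512-byte sectors, so Blk_wrtn/s can be converted to KiB/s. The sketch below does that for the five sdb samples above; the function name is my own.

```python
# iostat "blocks" are 512-byte sectors on Linux kernels 2.4 and later.
SECTOR_BYTES = 512


def blocks_per_sec_to_kib(blocks_per_sec):
    """Convert an iostat Blk_wrtn/s (or Blk_read/s) value to KiB/s."""
    return blocks_per_sec * SECTOR_BYTES / 1024


# Blk_wrtn/s for sdb, copied from the five iostat samples above.
sdb_blk_wrtn = [939.20, 776.00, 1312.00, 5064.00, 2604.80]

sdb_kib = [blocks_per_sec_to_kib(b) for b in sdb_blk_wrtn]
peak_kib = max(sdb_kib)

for blocks, kib in zip(sdb_blk_wrtn, sdb_kib):
    print(f"{blocks:8.2f} blk/s = {kib:7.1f} KiB/s")
print(f"peak: {peak_kib:.1f} KiB/s")
```

The peak works out to roughly 2.5 MiB/s of writes on sdb, with no reads at all in these samples, so the disks look close to idle.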
vmstat:
-bash-4.1$ vmstat 5
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 0 141532 163140 1955776 0 0 19 80 2 0 2 0 96 2 0
0 0 0 140664 163156 1956428 0 0 0 801 776 719 3 0 97 0 0
0 0 0 138880 163164 1958264 0 0 0 776 770 765 2 0 98 0 0
0 0 0 133820 163192 1963364 0 0 0 1570 1174 825 4 0 95 0 0
1 0 0 129984 163200 1967036 0 0 0 1422 1026 836 4 0 95 0 0
-bash-4.1$ lsof -u elasticsearch | wc -l
3004
/etc/security/limits.conf:elasticsearch hard nofile 65536
/etc/security/limits.conf:elasticsearch soft nofile 65536
/etc/security/limits.conf:elasticsearch - memlock unlimited
top - 18:15:25 up 18 days, 14:36, 1 user, load average: 0.23, 0.32, 0.32
Tasks: 190 total, 1 running, 189 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.5%us, 0.2%sy, 0.0%ni, 99.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 8060812k total, 7928472k used, 132340k free, 164384k buffers
Swap: 0k total, 0k used, 0k free, 1963024k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
26117 elastics 20 0 55.0g 5.2g 327m S 4.3 68.1 1836:21 java
1358 logstash 39 19 5078m 257m 11m S 0.7 3.3 183:28.43 java