High iowait on elasticsearch nodes

I need help with the following. We have a cluster of 7 machines indexing about 300 million documents per day. At some point during the day, not constantly, the iowait on 1 or 2 nodes in the cluster jumps to 60% and we start to see delays in processing records. It happens randomly on different nodes. Each node has 8 TB of EBS SSD disks combined with LVM.

Below are the iostat output and the ES configuration.

I would appreciate any help, since I'm currently in the dark and can't understand what is happening.

IOSTAT

avg-cpu: %user %nice %system %iowait %steal %idle
8.05 0.00 2.55 63.09 0.13 26.18

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
xvdap1 0.00 15.00 25.00 6.00 424.00 184.00 19.61 0.00 0.13 0.13 0.40
xvdc 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
xvdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
xvdf 141.00 0.00 1100.00 0.00 29024.00 0.00 26.39 6.48 5.71 0.77 84.80
xvdg 160.00 0.00 1157.00 0.00 30304.00 0.00 26.19 6.90 5.87 0.81 93.20
xvdh 154.00 0.00 1154.00 0.00 29840.00 0.00 25.86 6.77 5.77 0.76 87.20
xvdi 159.00 0.00 1145.00 0.00 29880.00 0.00 26.10 6.32 5.48 0.74 85.20
xvdj 154.00 0.00 1135.00 0.00 29448.00 0.00 25.95 7.16 6.19 0.73 82.40
xvdk 151.00 0.00 1103.00 0.00 28928.00 0.00 26.23 5.91 5.28 0.77 85.20
xvdl 150.00 0.00 1058.00 0.00 28912.00 0.00 27.33 4.36 4.00 0.81 85.20
xvdm 150.00 0.00 1112.00 0.00 29424.00 0.00 26.46 5.92 5.19 0.83 92.80
dm-1 0.00 0.00 10171.00 0.00 237216.00 0.00 23.32 67.92 6.54 0.10 100.00
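
(For context, this is iostat's extended device statistics over a short interval; roughly the following command, assuming sysstat is installed:)

# extended per-device statistics, refreshed every second
iostat -x 1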

Node configuration:
node.master: false
node.data: true
bootstrap.mlockall: true
discovery.zen.ping.multicast.enabled: false
network.host: _eth0:ipv4_
path.conf: /etc/elasticsearch
path.data: /ebs/elasticsearch
path.logs: /data/logs/elasticsearch
path.plugins: /usr/share/elasticsearch/plugins
indices.memory.index_buffer_size: 50%
index.translog.flush_threshold_ops: 50000
index.store.type: mmapfs
index.refresh_interval: 10s
threadpool.search.type: fixed
threadpool.search.size: 100
threadpool.search.queue_size: 200
threadpool.index.type: fixed
threadpool.index.size: 30
threadpool.index.queue_size: 1000
indices.fielddata.cache.size: 25%
indices.cluster.send_refresh_mapping: false
threadpool.bulk.queue_size: 3000
index.number_of_replicas: 1
index.search.slowlog.threshold.query.warn: 10s
index.search.slowlog.threshold.query.info: 5s
index.search.slowlog.threshold.query.debug: 2s
index.search.slowlog.threshold.query.trace: 500ms

index.search.slowlog.threshold.fetch.warn: 1s
index.search.slowlog.threshold.fetch.info: 800ms
index.search.slowlog.threshold.fetch.debug: 500ms
index.search.slowlog.threshold.fetch.trace: 200ms

index.indexing.slowlog.threshold.index.warn: 10s
index.indexing.slowlog.threshold.index.info: 5s
index.indexing.slowlog.threshold.index.debug: 2s
index.indexing.slowlog.threshold.index.trace: 500ms
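
(For completeness: the effective settings on a running node can be double-checked with the nodes info API, assuming ES answers on localhost:9200.)

# dump the settings the node actually started with
curl -s 'localhost:9200/_nodes/settings?pretty'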

It's not best practice to set thread pools like that. ES manages these (to a degree) dynamically, and hard-setting them like this can cause more problems.
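
If you want to see whether those fixed pools are actually queueing or rejecting work, the cat thread_pool API should show it (assuming the nodes are reachable on localhost:9200):

# active / queued / rejected counts per thread pool on each node
curl -s 'localhost:9200/_cat/thread_pool?v'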

Are you using the noop scheduler? If not I'd try that to start (a quick way to check and switch it is shown below).
Are merges happening at that time?
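
To check and, if needed, switch the scheduler on the EBS devices, something like this should do (device names taken from the iostat output above; run as root, and note it doesn't persist across reboots):

# see which IO scheduler is currently active for one of the data disks
cat /sys/block/xvdf/queue/scheduler
# switch it to noop; repeat for xvdg..xvdm
echo noop > /sys/block/xvdf/queue/scheduler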


No, we're not using the noop scheduler.

The problem with the high IO wait is that it starts happening at some point on some node with no external cause. We stopped the search nodes for a while to rule out a bad query, but the high IO wait continued until it suddenly disappeared a couple of hours later. What I can see in the HQ plugin is that our merge rate is 12.3 MB/sec.
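
If it helps, I can try to correlate the spikes with merge activity using something like this (assuming the standard nodes stats and hot threads APIs on localhost:9200):

# current and cumulative merge stats per node
curl -s 'localhost:9200/_nodes/stats/indices?pretty' | grep -A 10 '"merges"'
# see which threads (e.g. merge threads) are busy when iowait spikes
curl -s 'localhost:9200/_nodes/hot_threads'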

Also, I looked through the documentation and didn't find any reference to how the thread pool settings might affect the IO rate.

thank you
Igor