High disk read on one node out of 3


(Pavel Prokopenko) #1

Hi!
I observe very high disk read rate on one of our elaticsearch 5.5.2 nodes. Other 2 nodes do not have such problem.
iotop tool confirms that the disk reads are actually done by the elasticsearch threads:

  TID  PRIO  USER    DISK READ>  DISK WRITE  SWAPIN      IO    COMMAND                                                                                   
 8095 be/4 elastics     40.73 G   1732.88 M  0.00 % 12.71 % java -Xms4g -Xmx4g -XX:+UseConcMarkSweepGC -XX~csearch -Edefault.path.conf=/etc/elasticsearch
 8061 be/4 elastics     40.51 G   1722.35 M  0.00 % 12.72 % java -Xms4g -Xmx4g -XX:+UseConcMarkSweepGC -XX~csearch -Edefault.path.conf=/etc/elasticsearch
 8333 be/4 elastics      2.65 G      0.00 B  0.00 %  0.54 % java -Xms4g -Xmx4g -XX:+UseConcMarkSweepGC -XX~csearch -Edefault.path.conf=/etc/elasticsearch
 8026 be/4 elastics      2.45 G      8.00 K  0.00 %  0.53 % java -Xms4g -Xmx4g -XX:+UseConcMarkSweepGC -XX~csearch -Edefault.path.conf=/etc/elasticsearch
 8236 be/4 elastics      2.38 G      8.00 K  0.00 %  0.49 % java -Xms4g -Xmx4g -XX:+UseConcMarkSweepGC -XX~csearch -Edefault.path.conf=/etc/elasticsearch
 8112 be/4 elastics      2.25 G      0.00 B  0.00 %  0.50 % java -Xms4g -Xmx4g -XX:+UseConcMarkSweepGC -XX~csearch -Edefault.path.conf=/etc/elasticsearch
 1037 be/4 root          2.18 G     66.57 M  0.00 %  1.52 % snmpd -LS0-6d -f
 8316 be/4 elastics   1224.79 M    764.70 M  0.00 %  0.22 % java -Xms4g -Xmx4g -XX:+UseConcMarkSweepGC -XX~csearch -Edefault.path.conf=/etc/elasticsearch
 3129 be/4 mysql       919.16 M      3.42 M  0.00 %  0.06 % mysqld --wsrep_start_position=36f88a88-8d60-11e7-8399-ea05333056b4:3260959

Searching the forum for similar problem, I have found the following possible reasons:

  • Refresh

  • Merge

  • ID lookup

I have tried tried to check these areas, but still can't say I have found out anything useful.
Is there any method to find what is being so intensively read by elasticsearch?

Tried to look at the hot_threads output for this particular node.
::: {esmaas02}{z9_QEOfUTFCpsHwyagYkww}{1hOUnwP7Q1etdmcSqc9N2g}{10.212.43.47}{10.212.43.47:9300}{box_type=hot}
Hot threads at 2018-09-14T11:28:00.099Z, interval=1s, busiestThreads=3, ignoreIdleThreads=true:

    2.2% (22.4ms out of 1s) cpu usage by thread 'elasticsearch[esmaas02][bulk][T#1]'
     2/10 snapshots sharing following 2 elements
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:748)

One more observation: the reads drop down at 03:00 (the time, when the new indexes are being created). So, the problem possibly has some relation to the index size..

I have also tried to track particular indexes with primary shard located at the problem node, but no results.
So, the questions are:

  1. How to find what files are being read from disk?
  2. Why the disk reads are so high only on this one node?

Thank you!

Addition: ran more times the hot_threads and it looks like the reads are most probably caused by the id lookups:

10.0% (100.4ms out of 1s) cpu usage by thread 'elasticsearch[esmaas02][bulk][T#2]'
     10/10 snapshots sharing following 42 elements
       java.nio.Bits.copyToArray(Bits.java:834)
       java.nio.DirectByteBuffer.get(DirectByteBuffer.java:280)
       org.apache.lucene.store.ByteBufferGuard.getBytes(ByteBufferGuard.java:93)
       org.apache.lucene.store.ByteBufferIndexInput.readBytes(ByteBufferIndexInput.java:89)
       org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.loadBlock(SegmentTermsEnumFrame.java:176)
       org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekExact(SegmentTermsEnum.java:507)
       org.elasticsearch.common.lucene.uid.PerThreadIDAndVersionLookup.getDocID(PerThreadIDAndVersionLookup.java:102)
       org.elasticsearch.common.lucene.uid.PerThreadIDAndVersionLookup.lookupVersion(PerThreadIDAndVersionLookup.java:88)
       org.elasticsearch.common.lucene.uid.VersionsResolver.loadDocIdAndVersion(VersionsResolver.java:122)
       org.elasticsearch.common.lucene.uid.VersionsResolver.loadVersion(VersionsResolver.java:137)
       org.elasticsearch.index.engine.InternalEngine.loadCurrentVersionFromIndex(InternalEngine.java:1374)
       org.elasticsearch.index.engine.InternalEngine.resolveDocVersion(InternalEngine.java:394)
       org.elasticsearch.index.engine.InternalEngine.planIndexingAsPrimary(InternalEngine.java:571)
       org.elasticsearch.index.engine.InternalEngine.index(InternalEngine.java:494)
       org.elasticsearch.index.shard.IndexShard.index(IndexShard.java:556)
       org.elasticsearch.index.shard.IndexShard.index(IndexShard.java:545)
       org.elasticsearch.action.bulk.TransportShardBulkAction.executeIndexRequestOnPrimary(TransportShardBulkAction.java:484)
       org.elasticsearch.action.bulk.TransportShardBulkAction.executeBulkItemRequest(TransportShardBulkAction.java:143)

And here is one more graph (more detailed):

Someone said that the problem might be caused by memory fragmentation. So, here is some memory related info

[root@esmaas01]# free -h
              total        used        free      shared  buff/cache   available
Mem:            11G        8.5G        495M        583M        2.6G        2.0G
Swap:            0B          0B          0B
[root@esmaas01]# 

[root@esmaas02]# free -h
              total        used        free      shared  buff/cache   available
Mem:            11G         10G        143M        600M        1.3G        419M
Swap:            0B          0B          0B
[root@esmaas02]# 

[root@esmaas03]# free -h
              total        used        free      shared  buff/cache   available
Mem:            11G        8.4G        159M        608M        3.0G        2.1G
Swap:            0B          0B          0B
[root@esmaas03]#

(Pavel Prokopenko) #2

More observations: If I reboot the machine (esmaas02) - I do not see the problem for some time (2-4 days)..
Also, as you can see from the above post, the problem machine (esmaas02) has more memory used (and less left for the cache).
Could it be that the elasticsearch process starts massive disk reads when the cache is getting low?


(system) #3

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.