Good evening.
ES version 6.8.8 running in Docker (image docker.elastic.co/elasticsearch/elasticsearch:6.8.8).
heap_size: 31g
CPU Model on node 1: Intel(R) Xeon(R) CPU E3-1270 v5 @ 3.60GHz
CPU Model on node 2: Intel(R) Xeon(R) CPU E3-1270 v5 @ 3.60GHz
CPU Model on node 3: Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz
Load average on node1 and node2: ~1.
Load average on node3: ~6.
We have a cluster of 3 nodes with SSD disks. The contents of elasticsearch.yml:
cluster.name: cluster-name
cluster.routing.allocation.disk.watermark.flood_stage: 96%
cluster.routing.allocation.disk.watermark.high: 95%
cluster.routing.allocation.disk.watermark.low: 94%
cluster.routing.allocation.node_concurrent_recoveries: 5
cluster.routing.allocation.node_initial_primaries_recoveries: 1
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.unicast.hosts:
- node1
- node2
- node3
indices.fielddata.cache.size: 2g
indices.memory.index_buffer_size: 30%
indices.queries.cache.size: 1500m
indices.recovery.max_bytes_per_sec: 40mb
network.host: # different IP on each node (e.g. 192.168.0.57, 192.168.0.93, etc.)
node.name: node3
reindex.remote.whitelist:
- example1.com:9200
- example2.com:9200
- example3.com:9200
script.allowed_types: inline,stored
xpack.security.enabled: true
xpack.security.http.ssl.enabled: false
xpack.security.transport.ssl.certificate: <<cert/instance.crt>>
xpack.security.transport.ssl.certificate_authorities: <<cert/ca.crt>>
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.key: <<cert/crt.key>>
xpack.security.transport.ssl.key_passphrase: <<pass>>
xpack.security.transport.ssl.verification_mode: certificate
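(As a sanity check against config drift, the effective per-node settings can be compared with the nodes info API; the filter_path below is only there to narrow the output to the indices.* settings:)
GET _nodes/settings?filter_path=nodes.*.name,nodes.*.settings.indices
This should confirm whether all three nodes really run with the same indices.fielddata.cache.size and indices.queries.cache.size.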
GET _cat/indices/*?v&s=index
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
green open .kibana_1 6rr2MgPxQPeLa8BM-6sQzA 1 1 103 0 701kb 350.5kb
green open .kibana_2 h1elhP9uSv21VYVr1pMQDQ 1 1 433 16 2.6mb 1.3mb
green open .kibana_task_manager iSgnMYnRRI2OBn0ygS1ypA 1 1 2 0 25.9kb 12.9kb
green open .reporting-2021.10.31 GiUwDe9-QHOnh0UJN_qvgQ 1 1 4 0 1.2mb 664.2kb
green open .reporting-2021.11.07 OcTOLONbQDuWYcC8FyWyXA 1 1 33 2 42.1mb 21mb
green open .security-6 YvPLEm1NQauIlGcqEl7hfA 1 1 35 3 85.5kb 42.7kb
green open .tasks 2uqrVWd1RXynMlkYqQbW6A 1 1 1 0 12.4kb 6.2kb
green open index-name GWYxb-vQTJK_wOndK5vGEQ 12 1 109647228 37229985 359.8gb 179.1gb
GET _cat/shards?v&s=index
index shard prirep state docs store ip node
.kibana_1 0 r STARTED 103 350.5kb 192.168.0.93 node2
.kibana_1 0 p STARTED 103 350.5kb 192.168.0.92 node1
.kibana_2 0 p STARTED 433 1.3mb 192.168.0.93 node2
.kibana_2 0 r STARTED 433 1.3mb 192.168.0.92 node1
.kibana_task_manager 0 p STARTED 2 12.9kb 192.168.0.92 node1
.kibana_task_manager 0 r STARTED 2 12.9kb 192.168.0.57 node3
.reporting-2021.10.31 0 p STARTED 4 664.2kb 192.168.0.93 node2
.reporting-2021.10.31 0 r STARTED 4 664.2kb 192.168.0.57 node3
.reporting-2021.11.07 0 p STARTED 33 21mb 192.168.0.93 node2
.reporting-2021.11.07 0 r STARTED 33 21mb 192.168.0.92 node1
.security-6 0 r STARTED 35 42.7kb 192.168.0.92 node1
.security-6 0 p STARTED 35 42.7kb 192.168.0.57 node3
.tasks 0 r STARTED 1 6.2kb 192.168.0.93 node2
.tasks 0 p STARTED 1 6.2kb 192.168.0.57 node3
index-name 9 r STARTED 9131426 14.2gb 192.168.0.93 node2
index-name 9 p STARTED 9131426 13.9gb 192.168.0.92 node1
index-name 1 r STARTED 9130734 16gb 192.168.0.93 node2
index-name 1 p STARTED 9130733 15.7gb 192.168.0.57 node3
index-name 2 r STARTED 9142796 14.1gb 192.168.0.93 node2
index-name 2 p STARTED 9142796 13.6gb 192.168.0.57 node3
index-name 5 p STARTED 9139233 15.5gb 192.168.0.92 node1
index-name 5 r STARTED 9139230 14.6gb 192.168.0.57 node3
index-name 11 r STARTED 9140537 16.5gb 192.168.0.93 node2
index-name 11 p STARTED 9140536 15.6gb 192.168.0.92 node1
index-name 7 p STARTED 9135450 15.9gb 192.168.0.92 node1
index-name 7 r STARTED 9135450 15.4gb 192.168.0.57 node3
index-name 3 p STARTED 9133941 14.5gb 192.168.0.93 node2
index-name 3 r STARTED 9133943 14.5gb 192.168.0.57 node3
index-name 10 r STARTED 9137125 16.1gb 192.168.0.93 node2
index-name 10 p STARTED 9137125 16.5gb 192.168.0.92 node1
index-name 4 r STARTED 9135834 13.5gb 192.168.0.92 node1
index-name 4 p STARTED 9135830 12.4gb 192.168.0.57 node3
index-name 8 r STARTED 9138516 15.1gb 192.168.0.93 node2
index-name 8 p STARTED 9138516 14.4gb 192.168.0.92 node1
index-name 6 r STARTED 9142530 15.1gb 192.168.0.92 node1
index-name 6 p STARTED 9142530 15.4gb 192.168.0.57 node3
index-name 0 p STARTED 9139134 15.2gb 192.168.0.93 node2
index-name 0 r STARTED 9139132 14.9gb 192.168.0.57 node3
I am attaching screenshots showing the difference in fielddata cache size on the different nodes:
[Screenshot: fielddata cache, node 1]
[Screenshot: fielddata cache, node 3]
Question 1: Why are the fielddata cache values not identical on the different nodes, and how does this affect the operation of the cluster?
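(For comparison without screenshots, fielddata usage can also be pulled per node from the API, e.g.:)
GET _cat/fielddata?v
GET _nodes/stats/indices/fielddata?fields=*
The first call lists fielddata memory per field and per node; the second shows total fielddata memory and evictions for each node.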
Also, on node 3 the disk utilization is around 80%, and sometimes reaches 99%.
[Screenshot: I/O, node 1]
[Screenshot: I/O, node 2]
[Screenshot: I/O, node 3]
Could it be related to indices.fielddata.cache.size? Or to something else?
So, question 2: could the high I/O be caused by the low fielddata cache value on node 3 (indices.fielddata.cache.size)?
Or is it because of the weaker CPU on node 3?
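(To narrow down where the I/O on node 3 comes from, these node stats might help, e.g.:)
GET _cat/nodes?v&h=name,load_1m,cpu,heap.percent,disk.used_percent
GET _nodes/stats/fs,indices?filter_path=nodes.*.name,nodes.*.fs.io_stats,nodes.*.indices.search,nodes.*.indices.merges
The second call shows per-node disk io_stats alongside search and merge activity, which should indicate whether the load on node 3 is driven by queries or by merging.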