Hi,
I have upgraded ELK from version 6.6.1 to 7.0.1 in a Kubernetes environment. There are 11 nodes in the cluster: 3 master pods, 3 data pods, and 5 client pods. I see that the memory consumption of Elasticsearch keeps increasing indefinitely.
JVM heap configured for each type of pod:
master: -Xms1g -Xmx1g
client: -Xms16g -Xmx16g
data: -Xms12g -Xmx12g
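To double-check the heap each node actually picked up after the upgrade, I query the nodes API like this (the filter_path parameter is only there to trim the response):
$ curl -k 'https://elasticsearch.default.svc.cluster.local:9200/_nodes/jvm?pretty&filter_path=nodes.*.name,nodes.*.jvm.mem.heap_max_in_bytes' -uadmin:admin
The reported heap_max_in_bytes values match the -Xmx settings above.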
Cluster health
$ curl -k 'https://elasticsearch.default.svc.cluster.local:9200/_cluster/health?pretty' -uadmin:admin
{
"cluster_name" : "elk-efkc",
"status" : "yellow",
"timed_out" : false,
"number_of_nodes" : 11,
"number_of_data_nodes" : 3,
"active_primary_shards" : 307,
"active_shards" : 556,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 58,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 90.55374592833876
}
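To see which shards are unassigned and the high-level reason, I list them with the _cat/shards API:
$ curl -k 'https://elasticsearch.default.svc.cluster.local:9200/_cat/shards?v&h=index,shard,prirep,state,unassigned.reason' -uadmin:admin | grep UNASSIGNED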
Cluster Nodes
$ curl -k 'https://elasticsearch.default.svc.cluster.local:9200/_cat/nodes?v' -uadmin:admin
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
192.168.116.176 70 99 15 3.03 4.30 4.79 di - elk-efkc-elk-elasticsearch-data-2
192.168.126.80 72 87 9 1.12 1.07 1.19 i - elk-efkc-elk-elasticsearch-client-5d4c8b9f8f-kklsp
192.168.30.247 14 93 21 4.56 5.36 5.56 mi - elk-efkc-elk-elasticsearch-master-1
192.168.30.253 79 93 21 4.28 5.29 5.54 di - elk-efkc-elk-elasticsearch-data-1
192.168.225.151 75 93 16 3.99 4.50 4.90 di - elk-efkc-elk-elasticsearch-data-0
192.168.147.85 17 23 6 1.57 1.55 1.28 mi - elk-efkc-elk-elasticsearch-master-0
192.168.27.218 39 68 10 2.12 2.15 2.04 mi * elk-efkc-elk-elasticsearch-master-2
192.168.27.217 73 68 10 2.12 2.15 2.04 i - elk-efkc-elk-elasticsearch-client-5d4c8b9f8f-db9nd
192.168.147.51 67 56 15 2.28 2.43 2.48 i - elk-efkc-elk-elasticsearch-client-5d4c8b9f8f-wz5kn
192.168.7.143 71 59 13 1.19 1.69 1.88 i - elk-efkc-elk-elasticsearch-client-5d4c8b9f8f-7vkdf
192.168.250.158 70 67 8 1.04 1.05 1.16 i - elk-efkc-elk-elasticsearch-client-5d4c8b9f8f-v445g
Memory utilization of pods
$ kubectl top pods
NAME CPU(cores) MEMORY(bytes)
elk-efkc-elk-elasticsearch-client-5d4c8b9f8f-7vkdf 1140m 21131Mi
elk-efkc-elk-elasticsearch-client-5d4c8b9f8f-db9nd 759m 23241Mi
elk-efkc-elk-elasticsearch-client-5d4c8b9f8f-kklsp 489m 24213Mi
elk-efkc-elk-elasticsearch-client-5d4c8b9f8f-v445g 353m 22417Mi
elk-efkc-elk-elasticsearch-client-5d4c8b9f8f-wz5kn 1436m 18741Mi
elk-efkc-elk-elasticsearch-data-0 1804m 28096Mi
elk-efkc-elk-elasticsearch-data-1 1564m 28941Mi
elk-efkc-elk-elasticsearch-data-2 1810m 29922Mi
elk-efkc-elk-elasticsearch-exporter-768fb678b9-tvtm9 2m 53Mi
elk-efkc-elk-elasticsearch-master-0 4m 1328Mi
elk-efkc-elk-elasticsearch-master-1 4m 1318Mi
elk-efkc-elk-elasticsearch-master-2 15m 1371Mi
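Since the pod-level numbers above are well beyond the configured heaps, I also compared them against the heap usage Elasticsearch itself reports (again, filter_path only trims the output):
$ curl -k 'https://elasticsearch.default.svc.cluster.local:9200/_nodes/stats/jvm?pretty&filter_path=nodes.*.name,nodes.*.jvm.mem.heap_used_in_bytes,nodes.*.jvm.mem.heap_max_in_bytes' -uadmin:admin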
Cluster allocation
$ curl -k 'https://elasticsearch.paas.svc.cluster.local:9200/_cat/allocation?v' -uadmin:admin
shards disk.indices disk.used disk.avail disk.total disk.percent host ip node
192 45.5gb 70.5gb 129.3gb 199.9gb 35 192.168.225.151 192.168.225.151 elk-efkc-elk-elasticsearch-data-0
187 48.1gb 64.3gb 135.5gb 199.9gb 32 192.168.30.253 192.168.30.253 elk-efkc-elk-elasticsearch-data-1
183 21.4gb 43.6gb 156.2gb 199.9gb 21 192.168.116.176 192.168.116.176 elk-efkc-elk-elasticsearch-data-2
58 UNASSIGNED
I am facing this issue only after the upgrade to ELK 7.0.1; the pods' memory keeps growing, in some cases up to 30 GB. I see CircuitBreakerException errors like this in the data pods:
log":"[[raghu-impact-log-2019.09.03][0]] failed to perform indices:data/write/bulk[s] on replica [raghu-impact-log-2019.09.03][0], node[8aeYmmTUSdKTt6FVY7_Lew], [R], s[STARTED], a[id=A8L0r5ugQWyDgK1kYQjgrw]"}
org.elasticsearch.transport.RemoteTransportException: [elk-efkc-elk-elasticsearch-data-2][192.168.116.176:9300][indices:data/write/bulk[s][r]]
Caused by: org.elasticsearch.common.breaker.CircuitBreakingException: [parent] Data too large, data for [<transport_request>] would be [16544981698/15.4gb], which is larger than the limit of [16304314777/15.1gb], real usage: [16544880096/15.4gb], new bytes reserved: [101602/99.2kb]
at org.elasticsearch.indices.breaker.HierarchyCircuitBreakerService.checkParentLimit(HierarchyCircuitBreakerService.java:343) ~[elasticsearch-7.0.1.jar:7.0.1]
at org.elasticsearch.common.breaker.ChildMemoryCircuitBreaker.addEstimateBytesAndMaybeBreak(ChildMemoryCircuitBreaker.java:128) ~[elasticsearch-7.0.1.jar:7.0.1]
at org.elasticsearch.transport.TcpTransport.handleRequest(TcpTransport.java:1026) [elasticsearch-7.0.1.jar:7.0.1]
at org.elasticsearch.transport.TcpTransport.messageReceived(TcpTransport.java:922) [elasticsearch-7.0.1.jar:7.0.1]
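To see how close each breaker is to its limit when these errors occur, I check the breaker stats; the parent breaker is the one tripping in the error above:
$ curl -k 'https://elasticsearch.default.svc.cluster.local:9200/_nodes/stats/breaker?pretty' -uadmin:admin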
There are 58 unassigned shards, and the allocation explanation for the unassigned shards shows the same CircuitBreakerException (see the command below).
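This is how I fetched the explanation; with no request body it explains the first unassigned shard it finds:
$ curl -k 'https://elasticsearch.default.svc.cluster.local:9200/_cluster/allocation/explain?pretty' -uadmin:admin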
This is my GC configuration:
## GC configuration
-XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly
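As far as I understand, 7.x enables the real-memory parent circuit breaker by default (indices.breaker.total.use_real_memory), with the parent limit at roughly 95% of heap, which would match the 15.1gb limit in the error above for a 16gb heap. To confirm the breaker limits in effect, I checked the cluster settings including defaults:
$ curl -k 'https://elasticsearch.default.svc.cluster.local:9200/_cluster/settings?include_defaults=true&filter_path=defaults.indices.breaker' -uadmin:admin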
What could be the reason for Elasticsearch using so much memory, and what would be the way to control it?
Thanks,
Shivani