We have been having some issues with our production cluster, which is composed of 7 nodes. Right now the cluster is green, but I see something that is making me feel uncomfortable about its state.
"version" : {
"number" : "1.7.1",
java full version "1.8.0_51-b16"
I ran the node stats command and noticed that the "old" GC collector numbers look too high on all nodes.
According to what I read online: "The old generation collection count should remain small, and have a small collection_time_in_millis."
"gc" : {
"collectors" : {
"young" : {
"collection_count" : 14830,
"collection_time_in_millis" : 993189
},
"old" : {
"collection_count" : 2857,
"collection_time_in_millis" : 1082151
}
}
--
"gc" : {
"collectors" : {
"young" : {
"collection_count" : 14127,
"collection_time_in_millis" : 947026
},
"old" : {
"collection_count" : 2809,
"collection_time_in_millis" : 1814255
}
}
"gc" : {
"collectors" : {
"young" : {
"collection_count" : 27328,
"collection_time_in_millis" : 2547057
},
"old" : {
"collection_count" : 2493,
"collection_time_in_millis" : 284310
}
}
--
"gc" : {
"collectors" : {
"young" : {
"collection_count" : 15054,
"collection_time_in_millis" : 894937
},
"old" : {
"collection_count" : 2829,
"collection_time_in_millis" : 511283
}
}
"gc" : {
"collectors" : {
"young" : {
"collection_count" : 13068,
"collection_time_in_millis" : 1075700
},
"old" : {
"collection_count" : 2636,
"collection_time_in_millis" : 378742
}
}
--
"gc" : {
"collectors" : {
"young" : {
"collection_count" : 13100,
"collection_time_in_millis" : 1208937
},
"old" : {
"collection_count" : 2592,
"collection_time_in_millis" : 714342
}
}
"gc" : {
"collectors" : {
"young" : {
"collection_count" : 22154,
"collection_time_in_millis" : 1379513
},
"old" : {
"collection_count" : 2701,
"collection_time_in_millis" : 281411
}
}
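To make the numbers above easier to compare, here is a rough sketch that works out the average time per old-generation collection for each sample. The tuples are copied from the seven "gc" blocks in the order they appear in this post; the names OLD_GC_SAMPLES and avg_pause_ms are just mine for illustration, and I am not assuming which sample belongs to which node.

```python
# (collection_count, collection_time_in_millis) for the "old" collector,
# copied from the seven node stats samples above, in order of appearance.
OLD_GC_SAMPLES = [
    (2857, 1082151),
    (2809, 1814255),
    (2493, 284310),
    (2829, 511283),
    (2636, 378742),
    (2592, 714342),
    (2701, 281411),
]


def avg_pause_ms(count, total_ms):
    """Average time spent per collection, in milliseconds."""
    return total_ms / count if count else 0.0


for i, (count, total_ms) in enumerate(OLD_GC_SAMPLES, start=1):
    print(f"sample {i}: {count} old collections, "
          f"{avg_pause_ms(count, total_ms):.0f} ms on average")
```

The averages range from roughly 100 ms to over 600 ms per old collection, so the samples differ a lot even though the collection counts are similar.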
The server uptime is as follows:
Node1: 1 day
Node2: 28 days
Node3: 28 days
Node4: 28 days
Node5: 28 days
Node6: 20 days
Node7: 1 day
I also noticed that heap_used_percent is above 75% most of the time:
"jvm" : {
"timestamp" : 1441893597402,
"uptime_in_millis" : 145190322,
"mem" : {
"heap_used_in_bytes" : 19953288096,
"heap_used_percent" : 84,
"heap_committed_in_bytes" : 23587454976,
"heap_max_in_bytes" : 23587454976,
"non_heap_used_in_bytes" : 129210576,
"non_heap_committed_in_bytes" : 132513792,
"pools" : {
"jvm" : {
"timestamp" : 1441893597241,
"uptime_in_millis" : 145200995,
"mem" : {
"heap_used_in_bytes" : 20344159872,
"heap_used_percent" : 86,
"heap_committed_in_bytes" : 23587454976,
"heap_max_in_bytes" : 23587454976,
"non_heap_used_in_bytes" : 138011320,
"non_heap_committed_in_bytes" : 140562432,
"pools" : {
"jvm" : {
"timestamp" : 1441893596704,
"uptime_in_millis" : 143036366,
"mem" : {
"heap_used_in_bytes" : 20152739200,
"heap_used_percent" : 81,
"heap_committed_in_bytes" : 24661196800,
"heap_max_in_bytes" : 24661196800,
"non_heap_used_in_bytes" : 125541880,
"non_heap_committed_in_bytes" : 128040960,
"pools" : {
These are my elasticsearch.yml settings:
path.data: /var/data/elasticsearch
cluster.name: GM-RTD
node.master: true
node.name: ElasticSearch-1
http.cors.enabled: true
plugin.mandatory: cloud-aws
bootstrap.mlockall: true
cloud.aws.access_key: XXXXXX
cloud.aws.secret_key: XXXXXXX
discovery.type: ec2
discovery.zen.ping.multicast.enabled: false
discovery.ec2.groups: GM-VPC
discovery.zen.minimum_master_nodes: 1
gateway.recover_after_nodes: 1
gateway.recover_after_time: 5m
gateway.expected_nodes: 2
The heap size is set to:
ES_HEAP_SIZE=22g
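For comparison with the configured heap, this sketch converts the heap_max_in_bytes values reported in the "jvm" samples in this post to GiB. The helper name bytes_to_gib is mine, and I am assuming the "g" in ES_HEAP_SIZE means GiB (2^30 bytes).

```python
GIB = 1024 ** 3  # assuming "22g" means 22 * 2^30 bytes

# The two distinct heap_max_in_bytes values seen in the "jvm" samples.
HEAP_MAX_BYTES = [23587454976, 24661196800]


def bytes_to_gib(n_bytes):
    """Convert a byte count to GiB."""
    return n_bytes / GIB


for heap_max in HEAP_MAX_BYTES:
    print(f"heap_max_in_bytes={heap_max} is about {bytes_to_gib(heap_max):.2f} GiB")
```

The first value is just under 22 GiB, which fits ES_HEAP_SIZE=22g, but the second is exactly 1 GiB larger, so that node may be running with a different heap setting than the others.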
sudo sysctl -a | grep vm.max_map_count
vm.max_map_count = 262144
"process" : {
"refresh_interval_in_millis" : 1000,
"id" : 32383,
"max_file_descriptors" : 65535,
"mlockall" : true
}