Hello again. We are continuing to test out our new Elasticsearch cluster, and I'm trying to understand the following:
- Why is RAM usage always so high? It never decreases, even when nobody is searching or indexing. The only way to get RAM usage back down is to reboot the server the node runs on; restarting the service is not enough. (A nodes-stats request for checking this is sketched after this list.)
- We overwhelmed our cluster over the weekend, which caused 4 data nodes to crash (the logs say the heap ran out). As a result we have some unassigned shards. Is this something we need to fix ourselves, or does ES recover from it on its own? (The diagnostic requests I have in mind are after this list.)
- We are dividing our time-series data into monthly indices. Without a mapping these indices come out at roughly 40-50 GB; with an explicit mapping they can be as much as 40% smaller. What is an ideal index size, and would two primary shards be ideal? (A sketch of the settings and mapping involved is after this list.)
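In case it helps with the memory question, the heap vs. OS-level usage per node can be pulled with a standard nodes-stats request along these lines (nothing custom on our side):

GET _nodes/stats/jvm,os?filter_path=nodes.*.name,nodes.*.jvm.mem.heap_used_percent,nodes.*.os.mem

My understanding so far is that the os.mem figures count the filesystem cache as used memory, which would explain ram.percent staying high, but I'd like to confirm that.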
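For the unassigned shards, this is what I was planning to run to see why they are unassigned, unless there is a better approach:

# list shards with their state and the reason they are unassigned
GET _cat/shards?v&h=index,shard,prirep,state,unassigned.reason&s=state

# detailed allocation explanation for one unassigned shard
GET _cluster/allocation/explain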
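For context on the sizing question, the pp indices end up with two primary shards, one replica, and an explicit mapping. Written as an index template purely for illustration, it would look roughly like this on a 7.x cluster (the field names are placeholders, not our actual schema):

PUT _template/pp_monthly
{
  "index_patterns": ["pp*"],
  "settings": {
    "number_of_shards": 2,
    "number_of_replicas": 1
  },
  "mappings": {
    "properties": {
      "timestamp": { "type": "date" },
      "sender":    { "type": "keyword" },
      "subject":   { "type": "text" }
    }
  }
}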
I realize I'm asking for a lot of information. Any help is always appreciated.
Java heap size is about 40% of total RAM per node.
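If exact numbers are useful, the heap-to-RAM ratio per node can be confirmed with the extra _cat/nodes columns (they are not in the default output below):

GET _cat/nodes?v&h=name,node.role,heap.max,ram.max,heap.percent,ram.percent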
Here is the output of the _cat/nodes API:
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
xxx.xxx.xxx.xxx 67 99 0 0.00 0.02 0.05 r - odsts-coord2
xxx.xxx.xxx.xxx 46 99 0 0.01 0.03 0.05 dr - odsts-data2
xxx.xxx.xxx.xxx 6 44 0 0.00 0.02 0.05 dr - odsts-data10
xxx.xxx.xxx.xxx 27 99 0 0.00 0.01 0.05 dr - odsts-data1
xxx.xxx.xxx.xxx 60 99 0 0.07 0.03 0.05 ir - odsts-ingest1
xxx.xxx.xxx.xxx 12 30 0 0.00 0.02 0.05 mr - odsts-master3
xxx.xxx.xxx.xxx 8 61 0 0.00 0.01 0.05 ir - odsts-ingest4
xxx.xxx.xxx.xxx 11 30 0 0.00 0.01 0.05 mr - odsts-master2
xxx.xxx.xxx.xxx 49 32 0 0.00 0.01 0.05 r - odsts-coord1
xxx.xxx.xxx.xxx 7 61 0 0.00 0.01 0.05 ir - odsts-ingest3
xxx.xxx.xxx.xxx 31 99 0 0.00 0.01 0.05 dr - odsts-data3
xxx.xxx.xxx.xxx 35 99 0 0.00 0.01 0.05 dr - odsts-data4
xxx.xxx.xxx.xxx 38 46 0 0.00 0.01 0.05 ir - odsts-ingest2
xxx.xxx.xxx.xxx 24 99 0 0.01 0.03 0.05 dr - odsts-data9
xxx.xxx.xxx.xxx 17 30 0 0.02 0.02 0.05 mr * odsts-master1
xxx.xxx.xxx.xxx 7 44 0 0.01 0.03 0.05 dr - odsts-data6
xxx.xxx.xxx.xxx 7 44 0 0.00 0.01 0.05 dr - odsts-data8
xxx.xxx.xxx.xxx 43 99 0 0.02 0.02 0.05 dr - odsts-data5
xxx.xxx.xxx.xxx 4 99 0 0.00 0.01 0.05 dr - odsts-data7
Here is the output of _cluster/health:
{
"cluster_name" : "odsts",
"status" : "yellow",
"timed_out" : false,
"number_of_nodes" : 19,
"number_of_data_nodes" : 10,
"active_primary_shards" : 59,
"active_shards" : 123,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 3,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 97.61904761904762
}
Here is the output of _cat/indices. For the indices that start with ppYYYYMM we provided a mapping and thus saw a very large size reduction. The indices that start with haoYYYYMM have a dynamic mapping. The remaining indices can be ignored.
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
yellow open hao201904 4denZNJ6ToqvKDXQBjsrrw 2 1 85859781 0 32.4gb 21.4gb
green open hao201905 jnitbogERT-KR96De-W0hw 2 1 76985816 0 40.8gb 20.5gb
green open hao201906 _4cDT-RLR4uaXOTZdHTEig 2 1 80081591 0 36.7gb 18.3gb
green open hao201907 sxks1YHFQVCD5GuPWiOOGg 2 1 104374341 0 44.2gb 21.9gb
green open hao201908 y1vLJv9MRK-_GA7JK8-yTw 2 1 94266038 0 41.9gb 20.9gb
green open hao201909 d1ZKXbBHQmKHvKBpcgzwyg 2 1 85904737 0 39.4gb 19.7gb
green open pp202006 ht8QQGtZQmafZyZqVuCCYg 2 1 100279225 0 26.3gb 13.1gb
green open pp202005 2wYMv3E6Rt27qIbztk0rKw 2 1 112902847 0 55.3gb 27.6gb
green open .opendistro_security 9czsdXZVTtqewMj9KadRGw 1 9 0 0 2kb 208b
green open hao202001 gVhivL8wQdaZfiR-GQn9IA 2 1 96612513 0 51.3gb 25.6gb
yellow open hao202002 bUH2ei0bQQi_jIwDqv69IA 2 1 91103663 0 40.5gb 26.9gb
green open hao202003 hkyoHpljQuObzwYLUTxrqw 2 1 99579191 0 58.8gb 29.4gb
green open hao202004 xFoSm4SkSTSyvJ0LxOoNHA 2 1 113157011 0 63.9gb 31.9gb
green open hao202005 ZYLkztXiQ0SKrsE08be02w 2 1 112902847 0 63.1gb 31.5gb
green open hao202006 scIvWRBVTgORTChqtM1IJw 2 1 100279225 0 45.4gb 22.7gb
green open hao202007 7BH8fFfiTMy9bs0IXhy0Lg 2 1 111608062 0 50.3gb 25.1gb
yellow open hao201910 CjCLQzqEQ3iwIwYRnhNVIQ 2 1 93440898 0 31.9gb 21.3gb
green open hao201911 8Edeexc1Tamga6RaK1OYBA 2 1 89519843 0 41.9gb 20.9gb
green open hao201912 o-UXOKmYSHi7r6_K4MqR7A 2 1 92361624 0 41gb 20.4gb
green open .kibana_1 u32C8sxHRiKohRb1mA9DVg 1 1 0 0 416b 208b
green open pp202002 _BWl0EmaSaOe22WqzCXVDA 2 1 91103663 0 30.2gb 15.1gb
green open proofpoint-201209 A2Yy2YwSRbmOP2UGSQyVaQ 2 1 0 0 832b 416b
green open proofpoint-201208 -F5VdXmtTz-NhY3vPDD8kg 2 1 0 0 832b 416b
green open proofpoint-201207 6ArJI3L5TveKkoMc0k73MA 2 1 0 0 832b 416b
green open proofpoint-201206 cvpXPVRYQGid_ZMJbUWqRw 2 1 0 0 832b 416b
green open proofpoint-201205 TGoFhd-qSmanvEcB2X68Jg 2 1 0 0 832b 416b
green open proofpoint-201204 w42iNOL3T0-zcegyPRJQXA 2 1 262183 0 68.4mb 34.2mb
green open hao201901 rrLcsdidQ8S-emPUD5Za1g 2 1 84561811 0 43.7gb 21.8gb
green open hao201902 PxPe14-aSrCQzBXsXFhsHA 2 1 75577153 0 39.2gb 19.6gb
green open security-auditlog-2020.08.21 ZRBtRRm7SH2LaMX1XcK_4A 1 1 0 0 416b 208b
green open hao201903 a0bVZoOmQvm44M5MTw5KQQ 2 1 87647892 0 44.4gb 22.4gb