Hi, running...
ES 1.5.2
Java 1.8.0_45
Windows 2008
4 nodes, each with 32 cores, 128 GB RAM, and 5 TB of SSD
ES_HEAP_SIZE configured to 30g
I just finished bulk indexing 380,000,000 records, but all 4 nodes are sitting at about 60% heap usage and it isn't being collected. I have YourKit attached to one node and tried forcing a GC, but nothing went down.
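For reference, here's roughly how I'm pulling the per-node heap numbers, a minimal sketch against the nodes stats API (the host/port is a placeholder for one of our nodes):

```python
# Sketch: read per-node heap usage from the nodes stats API.
# "http://localhost:9200" is a placeholder for one of our nodes.
import json
import urllib.request

def heap_usage(host="http://localhost:9200"):
    with urllib.request.urlopen(host + "/_nodes/stats/jvm") as resp:
        stats = json.load(resp)
    for node_id, node in stats["nodes"].items():
        mem = node["jvm"]["mem"]
        used = mem["heap_used_in_bytes"]
        mx = mem["heap_max_in_bytes"]
        print("%s: %.1f%% of %.1f GB heap used"
              % (node.get("name", node_id), 100.0 * used / mx, mx / 2**30))

if __name__ == "__main__":
    heap_usage()
```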
I know -XX:+DisableExplicitGC is set in the .bat files, but through YourKit I'm still able to force a collection. Forcing it reclaimed some memory earlier, but not anymore.
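To confirm that flag really is set on every box, I just scan the startup scripts; a rough sketch, with a placeholder install path:

```python
# Sketch: check which ES .bat scripts set the explicit-GC flag.
# The install directory below is a placeholder for our actual layout.
from pathlib import Path

ES_BIN = Path(r"C:\elasticsearch-1.5.2\bin")  # placeholder install dir

for bat in ES_BIN.glob("*.bat"):
    text = bat.read_text(errors="ignore")
    if "-XX:+DisableExplicitGC" in text:
        print("%s sets -XX:+DisableExplicitGC" % bat.name)
```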
Here is what shows up in the Elasticsearch logs when I force a GC from YourKit:
[ES xxx 01-01 (xxxx)] [gc][young][87406][39154] duration [1.6s], collections [1]/[2.5s], total [1.6s]/[17h], memory [18.1gb]->[17.1gb]/[30gb], all_pools {[young] [1gb]->[32mb]/[0b]}{[survivor] [112mb]->[128mb]/[0b]}{[old] [16.9gb]->[16.9gb]/[30gb]}
[ES xxx 01-01 (xxxx)] [gc][young][87489][39155] duration [1.6s], collections [1]/[1.8s], total [1.6s]/[17h], memory [18.4gb]->[17.1gb]/[30gb], all_pools {[young] [1.3gb]->[0b]/[0b]}{[survivor] [128mb]->[128mb]/[0b]}{[old] [16.9gb]->[16.9gb]/[30gb]}
[ES xxx 01-01 (xxxx)] [gc][old][87496][3] duration [43.2s], collections [1]/[44.2s], total [43.2s]/[1.3m], memory [17.2gb]->[14.9gb]/[30gb], all_pools {[young] [136mb]->[0b]/[0b]}{[survivor] [128mb]->[0b]/[0b]}{[old] [16.9gb]->[14.9gb]/[30gb]}
[ES xxx 01-01 (xxxx)] [gc][young][87576][39156] duration [1.6s], collections [1]/[2.7s], total [1.6s]/[17h], memory [16.3gb]->[15gb]/[30gb], all_pools {[young] [1.3gb]->[0b]/[0b]}{[survivor] [0b]->[80mb]/[0b]}{[old] [14.9gb]->[14.9gb]/[30gb]}
[ES xxx 01-01 (xxxx)] [gc][young][87680][39157] duration [1.5s], collections [1]/[2.1s], total [1.5s]/[17h], memory [16.3gb]->[14.9gb]/[30gb], all_pools {[young] [1.3gb]->[0b]/[0b]}{[survivor] [80mb]->[32mb]/[0b]}{[old] [14.9gb]->[14.9gb]/[30gb]}
[ES xxx 01-01 (xxxx)] [gc][young][87770][39158] duration [1.6s], collections [1]/[1.9s], total [1.6s]/[17h], memory [16.4gb]->[14.9gb]/[30gb], all_pools {[young] [1.4gb]->[0b]/[0b]}{[survivor] [32mb]->[24mb]/[0b]}{[old] [14.9gb]->[14.9gb]/[30gb]}
[ES xxx 01-01 (xxxx)] [gc][young][87861][39159] duration [1.7s], collections [1]/[2.7s], total [1.7s]/[17h], memory [16.4gb]->[14.9gb]/[30gb], all_pools {[young] [1.4gb]->[0b]/[0b]}{[survivor] [24mb]->[24mb]/[0b]}{[old] [14.9gb]->[14.9gb]/[30gb]}
[ES xxx 01-01 (xxxx)] [gc][young][87953][39160] duration [1.5s], collections [1]/[1.9s], total [1.5s]/[17h], memory [16.3gb]->[14.9gb]/[30gb], all_pools {[young] [1.3gb]->[0b]/[0b]}{[survivor] [24mb]->[24mb]/[0b]}{[old] [14.9gb]->[14.9gb]/[30gb]}
[ES xxx 01-01 (xxxx)] [gc][young][88043][39161] duration [1.6s], collections [1]/[1.9s], total [1.6s]/[17h], memory [16.4gb]->[14.9gb]/[30gb], all_pools {[young] [1.4gb]->[0b]/[0b]}{[survivor] [24mb]->[32mb]/[0b]}{[old] [14.9gb]->[14.9gb]/[30gb]}
[ES xxx 01-01 (xxxx)] [gc][old][88079][4] duration [37.9s], collections [1]/[38.1s], total [37.9s]/[2m], memory [15.5gb]->[14.9gb]/[30gb], all_pools {[young] [544mb]->[8mb]/[0b]}{[survivor] [32mb]->[0b]/[0b]}{[old] [14.9gb]->[14.9gb]/[30gb]}
As you can see, I forced it twice (the two [gc][old] entries) and not much got collected.
I'm also using doc_values as much as possible, so the field data and filter caches are minimal: roughly 5 MB and 30 MB per node, respectively.
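Here's how I'm reading those cache numbers, in case my method is off (a sketch against the nodes stats API; the host/port is a placeholder, and the field names are what I believe the 1.x stats API returns):

```python
# Sketch: per-node fielddata and filter cache sizes from the nodes stats API.
import json
import urllib.request

def cache_sizes(host="http://localhost:9200"):
    with urllib.request.urlopen(host + "/_nodes/stats/indices") as resp:
        stats = json.load(resp)
    for node_id, node in stats["nodes"].items():
        idx = node["indices"]
        print("%s: fielddata %.1f MB, filter cache %.1f MB"
              % (node.get("name", node_id),
                 idx["fielddata"]["memory_size_in_bytes"] / 2**20,
                 idx["filter_cache"]["memory_size_in_bytes"] / 2**20))

if __name__ == "__main__":
    cache_sizes()
```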
Is it the memory-mapped files that are taking up the space? The cluster is completely idle right now.
Does this mean I need to add more nodes to bring the per-node memory consumption down?
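In case it helps, here's the sketch I'm using to look at on-heap segment memory per node; my understanding is that the mmap'd files themselves live outside the JVM heap, so they shouldn't show up in these numbers at all (host/port is a placeholder, field names are what I believe 1.x returns):

```python
# Sketch: per-node Lucene segment memory (this part is on the JVM heap;
# mmap'd index files are off-heap, as far as I understand).
import json
import urllib.request

def segment_memory(host="http://localhost:9200"):
    with urllib.request.urlopen(host + "/_nodes/stats/indices") as resp:
        stats = json.load(resp)
    for node_id, node in stats["nodes"].items():
        seg = node["indices"]["segments"]
        print("%s: %d segments, %.1f GB segment memory on heap"
              % (node.get("name", node_id),
                 seg["count"],
                 seg["memory_in_bytes"] / 2**30))

if __name__ == "__main__":
    segment_memory()
```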