Heap consumption

Need help with memory consumption.

This is our cluster:

After startup (for roughly the first 3 hours) heap memory usage follows the green line in the chart below. After two or more hours of running it follows the red line.
(screenshot: 2015-10-30 17-02-58)

After that the node crashed.
The GC prints messages like this in the logs:

[2015-10-28 20:37:42,134][WARN ][monitor.jvm              ] [process8] [gc][old][1312][37] duration [21s], collections [1]/[21.7s], total [21s]/[7.3m], memory [19.4gb]->[18.1gb]/[19.9gb], all_pools {[young] [231.3mb]->[114.6mb]/[532.5mb]}{[survivor] [66.5mb]->[0b]/[66.5mb]}{[old] [19.1gb]->[18gb]/[19.3gb]}
[2015-10-28 20:38:07,962][WARN ][monitor.jvm              ] [process8] [gc][old][1315][38] duration [23.2s], collections [1]/[23.7s], total [23.2s]/[7.7m], memory [19.2gb]->[18gb]/[19.9gb], all_pools {[young] [8.5mb]->[16.1mb]/[532.5mb]}{[survivor] [66.5mb]->[0b]/[66.5mb]}{[old] [19.1gb]->[18gb]/[19.3gb]}
[2015-10-28 20:38:34,725][WARN ][monitor.jvm              ] [process8] [gc][old][1322][39] duration [20.6s], collections [1]/[20.7s], total [20.6s]/[8m], memory [19.7gb]->[18.1gb]/[19.9gb], all_pools {[young] [479.9mb]->[60.6mb]/[532.5mb]}{[survivor] [66.5mb]->[0b]/[66.5mb]}{[old] [19.2gb]->[18gb]/[19.3gb]}
[2015-10-28 20:39:00,554][WARN ][monitor.jvm              ] [process8] [gc][old][1325][40] duration [23.1s], collections [1]/[23.8s], total [23.1s]/[8.4m], memory [19.4gb]->[18.1gb]/[19.9gb], all_pools {[young] [491.6mb]->[87.7mb]/[532.5mb]}{[survivor] [66.5mb]->[0b]/[66.5mb]}{[old] [18.9gb]->[18gb]/[19.3gb]}
[2015-10-28 20:39:25,743][WARN ][monitor.jvm              ] [process8] [gc][old][1333][41] duration [18s], collections [1]/[18.1s], total [18s]/[8.7m], memory [19.5gb]->[18.1gb]/[19.9gb], all_pools {[young] [461.7mb]->[49.5mb]/[532.5mb]}{[survivor] [66.5mb]->[0b]/[66.5mb]}{[old] [19gb]->[18gb]/[19.3gb]}
[2015-10-28 20:39:52,107][WARN ][monitor.jvm              ] [process8] [gc][old][1343][42] duration [16.9s], collections [1]/[17.3s], total [16.9s]/[9m], memory [19.4gb]->[18.1gb]/[19.9gb], all_pools {[young] [429.8mb]->[65.4mb]/[532.5mb]}{[survivor] [66.5mb]->[0b]/[66.5mb]}{[old] [18.9gb]->[18gb]/[19.3gb]}
[2015-10-28 20:40:19,758][WARN ][monitor.jvm              ] [process8] [gc][old][1349][43] duration [22.3s], collections [1]/[22.6s], total [22.3s]/[9.4m], memory [19.2gb]->[18.1gb]/[19.9gb], all_pools {[young] [265.9mb]->[87.5mb]/[532.5mb]}{[survivor] [66.5mb]->[0b]/[66.5mb]}{[old] [18.9gb]->[18gb]/[19.3gb]}
[2015-10-28 20:40:44,749][WARN ][monitor.jvm              ] [process8] [gc][old][1352][44] duration [22.7s], collections [1]/[22.9s], total [22.7s]/[9.7m], memory [19.2gb]->[18gb]/[19.9gb], all_pools {[young] [284.5mb]->[5.2mb]/[532.5mb]}{[survivor] [66.5mb]->[0b]/[66.5mb]}{[old] [18.9gb]->[18gb]/[19.3gb]}
[2015-10-28 21:57:08,889][INFO ][monitor.jvm              ] [process8] [gc][old][5920][279] duration [15.1s], collections [2]/[15.7s], total [15.1s]/[10.3m], memory [19.3gb]->[18.5gb]/[19.9gb], all_pools {[young] [209.6mb]->[70.2mb]/[532.5mb]}{[survivor] [66.5mb]->[0b]/[66.5mb]}{[old] [19gb]->[18.4gb]/[19.3gb]}

We have some custom settings in elasticsearch.yml:

bootstrap.mlockall: true
index.query.bool.max_clause_count: 100000
threadpool.bulk.queue_size: 300
http.max_content_length: 500mb
transport.tcp.compress: true
index.load_fixed_bitset_filters_eagerly: false

Here is a screenshot of the heap dump analyzer (we also have the full memory dump, 26 GB):

Class Name: org.elasticsearch.common.cache.LocalCache$LocalManualCache @ 0xdee6ad78
Shallow Heap: 16
Retained Heap: 11,148,875,400 (42.23%)

What exactly do you want help with?
Your GC pauses are staying below 30 seconds, so are you seeing other issues?

After the heap fills up, the node goes down (responses become slow and it then gets kicked out of the cluster).
We are now trying the G1 garbage collector. It seems to help.
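
For anyone curious, the change was roughly the following (assuming the 1.x startup scripts, where the GC flags live in bin/elasticsearch.in.sh; the exact default flags may differ per version):

# bin/elasticsearch.in.sh: swap the default CMS flags for G1
# (the existing -XX:+UseParNewGC / -XX:+UseConcMarkSweepGC lines have to be
# removed, otherwise the JVM complains about conflicting collectors)
JAVA_OPTS="$JAVA_OPTS -XX:+UseG1GC"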

Hi,

Since you're running out of heap, you should check how much memory field data and segments are using:

GET /_nodes/stats?human
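
If the full output is too noisy, the same stats can (if I remember the syntax correctly) be narrowed down to just field data and segments; the numbers to look at per node are fielddata.memory_size and segments.memory:

GET /_nodes/stats/indices/fielddata,segments?human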

It seems like you're under a lot of memory pressure and Java can't free up enough heap. I'm not sure, but LocalManualCache could be related to field data. If the above API shows most of your heap being used by field data, then I'd recommend using doc values:

https://www.elastic.co/guide/en/elasticsearch/guide/current/doc-values.html
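
For example, enabling doc values on a field in the mapping looks roughly like this (index, type and field names are just placeholders; it only works on not_analyzed string and numeric/date fields, and existing data has to be reindexed for it to take effect):

PUT /my_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "status": {
          "type": "string",
          "index": "not_analyzed",
          "doc_values": true
        }
      }
    }
  }
}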

Otherwise, as a temporary workaround until you implement doc values, set indices.fielddata.cache.size to something like 30%, or whatever value keeps the node from hitting OOM:

https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-fielddata.html#modules-fielddata
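
For example, in elasticsearch.yml (this is a static setting, so it needs a node restart to take effect):

indices.fielddata.cache.size: 30%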