I am running two nodes holding about 200 GB of logs on my backup database. The nodes were returning OutOfMemoryError after processing a few queries, so I set bootstrap.mlockall to true, modified /etc/security/limits.conf, exported ES_HEAP_SIZE=15g, and restarted both nodes (I have 64 GB of RAM). The two nodes now have 30 GB of heap between them, all locked and reserved just for them,
and they STILL return OutOfMemoryError after a few queries.
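To rule out the settings simply not being applied, the values each node reports can be read back. The following is a minimal sketch, assuming the nodes info API is reachable at `http://localhost:9200/_nodes/process,jvm` and that its response carries `process.mlockall` and `jvm.mem.heap_max_in_bytes` per node; the function names are hypothetical helpers, not part of any Elasticsearch client.

```python
import json
from urllib.request import urlopen


def summarize_node_settings(nodes_info):
    """Map node name -> (mlockall flag, max heap in GB) from a parsed nodes-info response."""
    summary = {}
    for node in nodes_info.get("nodes", {}).values():
        mlockall = node.get("process", {}).get("mlockall")
        heap_max = node.get("jvm", {}).get("mem", {}).get("heap_max_in_bytes")
        # Convert bytes to GB (1024^3) so the value is comparable with ES_HEAP_SIZE.
        heap_gb = round(heap_max / 1024 ** 3, 1) if heap_max is not None else None
        summary[node.get("name")] = (mlockall, heap_gb)
    return summary


def fetch_nodes_info(base_url="http://localhost:9200"):
    """Fetch process and JVM info for all nodes (assumed endpoint, see lead-in)."""
    with urlopen(base_url + "/_nodes/process,jvm", timeout=10) as resp:
        return json.load(resp)
```

Calling `summarize_node_settings(fetch_nodes_info())` should show `True` and roughly 15 GB for each node if both changes took effect; a `False` or a much smaller heap would mean the configuration is not being picked up.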
I am using Elasticsearch 2.3.3 with the head and javascript plugins, and Kibana with the Sense plugin. Before a node returns OutOfMemoryError, I can see it trying to collect garbage multiple times. Here is part of my logs:
    at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
    at org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:33)
    at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:75)
    at org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:300)
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
    ... 3 more
[2016-08-05 09:51:26,114][WARN ][monitor.jvm ] [Scarlet Witch] [gc][young][597][25] duration [1s], collections [1]/[1.2s], total [1s]/[7s], memory [9.4gb]->[10.1gb]/[14.9gb], all_pools {[young] [95mb]->[14.4mb]/[665.6mb]}{[survivor] [83.1mb]->[83.1mb]/[83.1mb]}{[old] [9.2gb]->[10gb]/[14.1gb]}
[2016-08-05 09:51:49,380][WARN ][monitor.jvm ] [Scarlet Witch] [gc][old][604][1] duration [16.8s], collections [1]/[16.8s], total [16.8s]/[16.8s], memory [14.6gb]->[10.5gb]/[14.9gb], all_pools {[young] [661mb]->[8.8mb]/[665.6mb]}{[survivor] [83.1mb]->[0b]/[83.1mb]}{[old] [13.9gb]->[10.4gb]/[14.1gb]}
[2016-08-05 09:52:26,842][WARN ][monitor.jvm ] [Scarlet Witch] [gc][old][613][2] duration [27.5s], collections [1]/[27.6s], total [27.5s]/[44.4s], memory [14.5gb]->[13.8gb]/[14.9gb], all_pools {[young] [632.8mb]->[12.4mb]/[665.6mb]}{[survivor] [83.1mb]->[0b]/[83.1mb]}{[old] [13.8gb]->[13.8gb]/[14.1gb]}
[2016-08-05 09:52:53,890][WARN ][monitor.jvm ] [Scarlet Witch] [gc][young][614][41] duration [1.7s], collections [1]/[27s], total [1.7s]/[15.6s], memory [13.8gb]->[14.2gb]/[14.9gb], all_pools {[young] [12.4mb]->[99.4mb]/[665.6mb]}{[survivor] [0b]->[0b]/[83.1mb]}{[old] [13.8gb]->[14.1gb]/[14.1gb]}
[2016-08-05 09:52:53,890][WARN ][monitor.jvm ] [Scarlet Witch] [gc][old][614][3] duration [24.5s], collections [1]/[27s], total [24.5s]/[1.1m], memory [13.8gb]->[14.2gb]/[14.9gb], all_pools {[young] [12.4mb]->[99.4mb]/[665.6mb]}{[survivor] [0b]->[0b]/[83.1mb]}{[old] [13.8gb]->[14.1gb]/[14.1gb]}
[2016-08-05 09:53:32,837][WARN ][monitor.jvm ] [Scarlet Witch] [gc][old][616][4] duration [37.7s], collections [1]/[37.9s], total [37.7s]/[1.7m], memory [14.8gb]->[14.7gb]/[14.9gb], all_pools {[young] [665.6mb]->[543.8mb]/[665.6mb]}{[survivor] [40.9mb]->[0b]/[83.1mb]}{[old] [14.1gb]->[14.1gb]/[14.1gb]}
[2016-08-05 09:54:00,390][WARN ][monitor.jvm ] [Scarlet Witch] [gc][old][617][5] duration [27.3s], collections [1]/[27.1s], total [27.3s]/[2.2m], memory [14.7gb]->[14.8gb]/[14.9gb], all_pools {[young] [543.8mb]->[665.6mb]/[665.6mb]}{[survivor] [0b]->[66.3mb]/[83.1mb]}{[old] [14.1gb]->[14.1gb]/[14.1gb]}
[2016-08-05 09:54:36,604][WARN ][monitor.jvm ] [Scarlet Witch] [gc][old][618][6] duration [36.1s], collections [1]/[36.6s], total [36.1s]/[2.8m], memory [14.8gb]->[14.9gb]/[14.9gb], all_pools {[young] [665.6mb]->[665.6mb]/[665.6mb]}{[survivor] [66.3mb]->[80.4mb]/[83.1mb]}{[old] [14.1gb]->[14.1gb]/[14.1gb]}
java.lang.OutOfMemoryError: Java heap space
Dumping heap to java_pid7907.hprof ...
Heap dump file created [21361113086 bytes in 108.208 secs]
Exception in thread "elasticsearch[Scarlet Witch][transport_client_worker][T#8]" java.lang.OutOfMemoryError: Java heap space
[2016-08-05 09:57:55,528][WARN ][netty.channel.socket.nio.AbstractNioSelector] Unexpected exception in the selector loop.
[2016-08-05 09:58:42,809][WARN ][netty.channel.socket.nio.AbstractNioSelector] Unexpected exception in the selector loop.
[2016-08-05 09:58:43,211][WARN ][monitor.jvm ] [Scarlet Witch] [gc][old][619][11] duration [2.3m], collections [5]/[4.1m], total [2.3m]/[5.1m], memory [14.9gb]->[14.9gb]/[14.9gb], all_pools {[young] [665.6mb]->[665.6mb]/[665.6mb]}{[survivor] [80.4mb]->[83mb]/[83.1mb]}{[old] [14.1gb]->[14.1gb]/[14.1gb]}
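The pattern in these monitor.jvm warnings is easier to see when only the old-generation numbers are pulled out. Here is a rough sketch that does this, assuming the log lines keep exactly the format shown above; the regular expression and the helper names are my own and not anything Elasticsearch provides.

```python
import re

# Matches one monitor.jvm GC warning and captures the old-generation pool sizes.
GC_LINE = re.compile(
    r"\[gc\]\[(?P<kind>young|old)\]\[\d+\]\[\d+\] duration \[(?P<duration>[^\]]+)\]"
    r".*?\{\[old\] \[(?P<before>[^\]]+)\]->\[(?P<after>[^\]]+)\]/\[(?P<max>[^\]]+)\]\}"
)

UNITS = {"b": 1, "kb": 1024, "mb": 1024 ** 2, "gb": 1024 ** 3}


def to_bytes(text):
    """Convert a size such as '13.9gb' or '0b' to a number of bytes."""
    match = re.fullmatch(r"([\d.]+)(b|kb|mb|gb)", text)
    return float(match.group(1)) * UNITS[match.group(2)]


def old_gen_summary(log_text):
    """Return (collector, duration, old gen % used afterwards, GB reclaimed) per GC entry."""
    rows = []
    for m in GC_LINE.finditer(log_text):
        before, after, cap = (to_bytes(m.group(k)) for k in ("before", "after", "max"))
        used_pct = round(100 * after / cap, 1)
        reclaimed_gb = round((before - after) / 1024 ** 3, 1)
        rows.append((m.group("kind"), m.group("duration"), used_pct, reclaimed_gb))
    return rows
```

Run over the excerpt, this shows the first old collection freeing 3.5 GB and the later ones freeing nothing while the old generation sits at 100% of 14.1 GB, which suggests the heap is filled with live objects rather than garbage.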
Note that I sent some malformed GET requests with Sense (I forgot to write the scripts for a bucket_script aggregation). I think the nodes are now broken: accessing localhost:9200/head results in a timeout. I want my nodes not to break after I send queries, so that I don't have to reboot my machine again and again to test them. What should I do?
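The kind of mistake described here, a bucket_script aggregation sent without its script, could in principle be caught on the client before the request goes out. Below is a minimal sketch, assuming the request body is available as a Python dict; `find_incomplete_bucket_scripts` is a hypothetical helper, and it only checks that the `buckets_path` and `script` keys are present, not that the script itself is valid.

```python
def find_incomplete_bucket_scripts(aggs, path=""):
    """Walk an aggregations dict and list bucket_script aggs missing required keys."""
    problems = []
    for name, body in aggs.items():
        here = f"{path}>{name}" if path else name
        spec = body.get("bucket_script")
        if spec is not None:
            # A bucket_script aggregation needs both of these to be non-empty.
            missing = [key for key in ("buckets_path", "script") if not spec.get(key)]
            if missing:
                problems.append((here, missing))
        # Recurse into sub-aggregations under either accepted key.
        sub_aggs = body.get("aggs") or body.get("aggregations") or {}
        problems.extend(find_incomplete_bucket_scripts(sub_aggs, here))
    return problems
```

For a request whose nested aggregation has `buckets_path` but no `script`, the function returns that aggregation's path together with `["script"]`, so the query can be fixed before it reaches the cluster.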