Exception caught on netty layer - java.lang.OutOfMemoryError

Hi folks,
I keep getting a JVM out-of-memory error. I have used the option to disable index closing, so when this error occurs the indices do not close.
I would appreciate any advice on this.
Thanks in advance.
Sanj

I have attached the logs and machine configuration. The node runs on a virtual machine.
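For reference, the log excerpt further down shows this setting being changed dynamically. A sketch of the corresponding cluster settings call (assuming a node reachable on localhost:9200; use "persistent" instead of "transient" to survive restarts):

```shell
# Disable the close-index API cluster-wide (transient: lost on full cluster restart).
curl -XPUT http://localhost:9200/_cluster/settings -H 'Content-Type: application/json' -d '{
  "transient": {
    "cluster.indices.close.enable": "false"
  }
}'
```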

jvm.options file:
-XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly
-XX:+AlwaysPreTouch
-Xss1m
-Djava.awt.headless=true
-Dfile.encoding=UTF-8
-Djna.nosys=true
-XX:-OmitStackTraceInFastThrow
-Dio.netty.noUnsafe=true
-Dio.netty.noKeySetOptimization=true
-Dio.netty.recycler.maxCapacityPerThread=0
-Dlog4j.shutdownHookEnabled=false
-Dlog4j2.disable.jmx=true
-Djava.io.tmpdir=${ES_TMPDIR}
-XX:+HeapDumpOnOutOfMemoryError
-Xmx8192m
-Xms8192m

exception caught on netty layer
java.lang.OutOfMemoryError

Elasticsearch: 6.3.4
JVM: 1.8.0_181
Physical RAM: 16 GB
JVM heap: 8 GB
OS: Windows Server 2016

[2018-08-21T16:47:02,511][INFO ][o.e.c.s.ClusterSettings ] [CBDES02] updating [cluster.indices.close.enable] from [true] to [false]
[2018-08-21T17:31:02,542][WARN ][monitor.jvm] [NODE1] [gc][ParNew][3246454][64605] duration [822ms], collections [1]/[4.3s], total [822ms]/[21m], memory [966.1mb]->[766.9mb]/[990.7mb], all_pools {[Code Cache] [13.1mb]->[13.1mb]/[48mb]}{[Par Eden Space] [266.2mb]->[75.6mb]/[266.2mb]}{[Par Survivor Space] [8.9mb]->[0b]/[33.2mb]}{[CMS Old Gen] [690.8mb]->[691.2mb]/[691.2mb]}{[CMS Perm Gen] [33.6mb]->[33.6mb]/[82mb]}
[2018-08-21T17:31:02,542][WARN ][transport.netty] [NODE1] exception caught on netty layer [[id: 0x03d14f1c, /10.0.6.100:36055 => /10.0.6.105:9300]]
java.lang.OutOfMemoryError: Java heap space
at org.apache.lucene.util.PriorityQueue.initialize(PriorityQueue.java:108)
at org.elasticsearch.search.controller.ScoreDocQueue.<init>(ScoreDocQueue.java:32)
[2018-08-21T17:31:02,975][WARN ][r.suppressed ] path: /_all/_close, params: {index=_all}
java.lang.IllegalStateException: closing indices is disabled - set [cluster.indices.close.enable: true] to enable it. NOTE: closed indices still consume a significant amount of diskspace
at org.elasticsearch.action.admin.indices.close.TransportCloseIndexAction.doExecute(TransportCloseIndexAction.java:87) ~[elasticsearch-6.2.4.jar:6.2.4]
at org.elasticsearch.action.admin.indices.close.TransportCloseIndexAction.doExecute(TransportCloseIndexAction.java:48) ~[elasticsearch-6.2.4.jar:6.2.4]
at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:167) ~[elasticsearch-6.2.4.jar:6.2.4]
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:139) ~[elasticsearch-6.2.4.jar:6.2.4]
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:81) ~[elasticsearch-6.2.4.jar:6.2.4]
at org.elasticsearch.client.node.NodeClient.executeLocally(NodeClient.java:83) ~[elasticsearch-6.2.4.jar:6.2.4]
at org.elasticsearch.client.node.NodeClient.doExecute(NodeClient.java:72) ~[elasticsearch-6.2.4.jar:6.2.4]
at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:405) ~[elasticsearch-6.2.4.jar:6.2.4]
at org.elasticsearch.client.support.AbstractClient$IndicesAdmin.execute(AbstractClient.java:1253) ~[elasticsearch-6.2.4.jar:6.2.4]
at org.elasticsearch.client.support.AbstractClient$IndicesAdmin.close(AbstractClient.java:1409) ~[elasticsearch-6.2.4.jar:6.2.4]
at org.elasticsearch.rest.action.admin.indices.RestCloseIndexAction.lambda$prepareRequest$0(RestCloseIndexAction.java:52) ~[elasticsearch-6.2.4.jar:6.2.4]
at org.elasticsearch.rest.BaseRestHandler.handleRequest(BaseRestHandler.java:97) [elasticsearch-6.2.4.jar:6.2.4]
at org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:240) [elasticsearch-6.2.4.jar:6.2.4]
at org.elasticsearch.rest.RestController.tryAllHandlers(RestController.java:336) [elasticsearch-6.2.4.jar:6.2.4]
at org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:174) [elasticsearch-6.2.4.jar:6.2.4]
at org.elasticsearch.http.netty4.Netty4HttpServerTransport.dispatchRequest(Netty4HttpServerTransport.java:500) [transport-netty4-6.2.4.jar:6.2.4]
at org.elasticsearch.http.netty4.Netty4HttpRequestHandler.channelRead0(Netty4HttpRequestHandler.java:80) [transport-netty4-6.2.4.jar:6.2.4]
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at org.elasticsearch.http.netty4.pipelining.HttpPipeliningHandler.channelRead(HttpPipeliningHandler.java:68) [transport-netty4-6.2.4.jar:6.2.4]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102) [netty-codec-4.1.16.Final.jar:4.1.16.Final]
at io.netty.handler.codec.MessageToMessageCodec.channelRead(MessageToMessageCodec.java:111) [netty-codec-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at ... (stack trace truncated)

Can you provide some details of the operations you run on the cluster? What does your data model look like? Are you performing specific queries when the node runs out of memory? A common cause for OOM is a nested aggregation query on high cardinality fields.
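As an illustration of that last point (field names here are hypothetical, not from your mapping), a terms aggregation nested inside another terms aggregation on high-cardinality fields forces Elasticsearch to build a very large bucket tree in heap, which is a classic OOM trigger:

```json
{
  "size": 0,
  "aggs": {
    "by_user": {
      "terms": { "field": "user_id", "size": 10000 },
      "aggs": {
        "by_session": {
          "terms": { "field": "session_id", "size": 10000 }
        }
      }
    }
  }
}
```

With many distinct values per field, the number of candidate buckets grows roughly as the product of the two cardinalities.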

The underlying issue for your node running out of memory is not going to be solved by setting cluster.indices.close.enable to false. See the documentation for what closing an index means.

The script to create one index is shown below.

knowledgebaseproduct1

size: 1.69Gi (1.69Gi)
docs: 3,204,295 (3,231,925)

We are currently using NEST 6.3.4 (the .NET client) to handle all queries and inserts/updates.

We are doing a text search across one or many fields, looking for documents where CompliFolderFolderNameCriteria contains or starts with specific words.
Sometimes a search uses two fields, e.g. CompliFolderFolderNameCriteria and NameNumberAliasCriteria.

The result set usually contains about 3,000 documents per search.
Searches can be run by many users (currently about 5 users at most).

Data ingestion is done with Logstash, which pulls from SQL and runs daily at 6 pm.
Inserts and updates to individual documents are handled in real time, but at most around 5,000 per day.
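For context, a starts-with search against the autocomplete subfields defined in the mapping below (edge_ngram at index time, standard analyzer at search time) might look like this; the query text and size are hypothetical examples:

```json
{
  "size": 3000,
  "query": {
    "bool": {
      "should": [
        { "match": { "CompliFolderFolderNameCriteria.autocompletefield": "acme corp" } },
        { "match": { "NameNumberAliasCriteria.autocompletefield": "acme corp" } }
      ],
      "minimum_should_match": 1
    }
  }
}
```

Note that requesting 3,000 hits per search multiplies the fetch-phase memory cost, especially when several users search concurrently.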

curl -XPUT http://localhost:9200/knowledgebaseproduct1 -H'Content-Type: application/json' -d '{
"settings": {
"number_of_shards": 3,
"number_of_replicas": 0,
"analysis": {
"char_filter": {
"removewhitespaces": {
"type": "pattern_replace",
"pattern": "\\s+",
"replacement": ""
}
},
"filter": {
"autocomplete_filter": {
"type": "edge_ngram",
"min_gram": 1,
"max_gram": 30
},
"ngram_filter": {
"type": "ngram",
"min_gram": 2,
"max_gram": 3
},
"removesymbols": {
"type": "pattern_replace",
"pattern": "[^A-Za-z0-9]",
"replacement": ""
}
},
"analyzer": {
"autocomplete": {
"type": "custom",
"tokenizer": "standard",
"filter": [
"lowercase",
"autocomplete_filter"
]
},
"ngram": {
"type": "custom",
"tokenizer": "standard",
"filter": [
"lowercase",
"ngram_filter"
],
"char_filter": [
"removewhitespaces"
]
},
"keyvalue": {
"type": "custom",
"tokenizer": "keyword",
"filter": [
"lowercase",
"asciifolding",
"removesymbols"
],
"char_filter": [
"removewhitespaces"
]
},
"basic": {
"type": "custom",
"tokenizer": "standard",
"filter": [
"lowercase"
]
}
}
}
},
"mappings": {
"doc": {
"dynamic": "strict",
"properties": {
"CompliFolderCompliFolderIdCriteria":{
"type": "long"
},
"CompliFolderCompanyIdCriteria": {
"type": "long"
},
"CompliFolderPhysicalStateLkpIdCriteria": {
"type": "long"
},
"CompliFolderStatusLkpIdCriteria":{
"type": "long"
},
"ObtainmentActionsNoticeNumberCriteria":{
"type": "text",
"fields": {
"autocompletefield": {
"type": "text",
"analyzer": "autocomplete",
"search_analyzer": "standard"
},
"ngramfield": {
"type": "text",
"analyzer": "ngram"
},
"keyvaluefield": {
"type": "text",
"analyzer": "keyvalue"
},
"basicfield": {
"type": "text",
"analyzer": "basic"
}
}
},
"ObtainmentWorkItemContainsProductCriteria":{
"type": "long"
},
"CompliFolderFolderNameCriteria": {
"type": "text",
"fields": {
"autocompletefield": {
"type": "text",
"analyzer": "autocomplete",
"search_analyzer": "standard"
},
"ngramfield": {
"type": "text",
"analyzer": "ngram"
},
"keyvaluefield": {
"type": "text",
"analyzer": "keyvalue"
},
"basicfield": {
"type": "text",
"analyzer": "basic"
}
}
},
"NameNumberNameOrNumberCriteria": {
"type": "text",
"fields": {
"autocompletefield": {
"type": "text",
"analyzer": "autocomplete",
"search_analyzer": "standard"
},
"ngramfield": {
"type": "text",
"analyzer": "ngram"
},
"keyvaluefield": {
"type": "text",
"analyzer": "keyvalue"
},
"basicfield": {
"type": "text",
"analyzer": "basic"
}
}
},
"NameNumberAliasCriteria": {
"type": "text",
"fields": {
"autocompletefield": {
"type": "text",
"analyzer": "autocomplete",
"search_analyzer": "standard"
},
"ngramfield": {
"type": "text",
"analyzer": "ngram"
},
"keyvaluefield": {
"type": "text",
"analyzer": "keyvalue"
},
"basicfield": {
"type": "text",
"analyzer": "basic"
}
}
}
}
}
}
}'
sleep 10

Can you show the full output of the cluster stats API?

{
"_nodes" : {
"total" : 1,
"successful" : 1,
"failed" : 0
},
"cluster_name" : "complielasticsearch",
"nodes" : {
"iU7a_H4_TMG6BFR6rITy4w" : {
"timestamp" : 1534975291774,
"name" : "CBDES02",
"transport_address" : "10.54.43.108:9300",
"host" : "CBDES02.DEV.LOCAL",
"ip" : "10.54.43.108:9300",
"roles" : [
"master",
"data",
"ingest"
],
"indices" : {
"docs" : {
"count" : 13686193,
"deleted" : 31902
},
"store" : {
"size_in_bytes" : 8281607051
},
"indexing" : {
"index_total" : 0,
"index_time_in_millis" : 0,
"index_current" : 0,
"index_failed" : 0,
"delete_total" : 0,
"delete_time_in_millis" : 0,
"delete_current" : 0,
"noop_update_total" : 0,
"is_throttled" : false,
"throttle_time_in_millis" : 0
},
"get" : {
"total" : 0,
"time_in_millis" : 0,
"exists_total" : 0,
"exists_time_in_millis" : 0,
"missing_total" : 0,
"missing_time_in_millis" : 0,
"current" : 0
},
"search" : {
"open_contexts" : 0,
"query_total" : 18,
"query_time_in_millis" : 145,
"query_current" : 0,
"fetch_total" : 0,
"fetch_time_in_millis" : 0,
"fetch_current" : 0,
"scroll_total" : 0,
"scroll_time_in_millis" : 0,
"scroll_current" : 0,
"suggest_total" : 0,
"suggest_time_in_millis" : 0,
"suggest_current" : 0
},
"merges" : {
"current" : 0,
"current_docs" : 0,
"current_size_in_bytes" : 0,
"total" : 0,
"total_time_in_millis" : 0,
"total_docs" : 0,
"total_size_in_bytes" : 0,
"total_stopped_time_in_millis" : 0,
"total_throttled_time_in_millis" : 0,
"total_auto_throttle_in_bytes" : 377487360
},
"refresh" : {
"total" : 127,
"total_time_in_millis" : 31,
"listeners" : 0
},
"flush" : {
"total" : 18,
"total_time_in_millis" : 1401
},
"warmer" : {
"current" : 0,
"total" : 54,
"total_time_in_millis" : 2
},
"query_cache" : {
"memory_size_in_bytes" : 0,
"total_count" : 14,
"hit_count" : 0,
"miss_count" : 14,
"cache_size" : 0,
"cache_count" : 0,
"evictions" : 0
},
"fielddata" : {
"memory_size_in_bytes" : 0,
"evictions" : 0
},
"completion" : {
"size_in_bytes" : 0
},
"segments" : {
"count" : 269,
"memory_in_bytes" : 47067957,
"terms_memory_in_bytes" : 44304116,
"stored_fields_memory_in_bytes" : 1776536,
"term_vectors_memory_in_bytes" : 0,
"norms_memory_in_bytes" : 300480,
"points_memory_in_bytes" : 595325,
"doc_values_memory_in_bytes" : 91500,
"index_writer_memory_in_bytes" : 0,
"version_map_memory_in_bytes" : 0,
"fixed_bit_set_memory_in_bytes" : 0,
"max_unsafe_auto_id_timestamp" : -1,
"file_sizes" : { }
},
"translog" : {
"operations" : 73,
"size_in_bytes" : 80884,
"uncommitted_operations" : 0,
"uncommitted_size_in_bytes" : 774
},
"request_cache" : {
"memory_size_in_bytes" : 5406,
"evictions" : 0,
"hit_count" : 3,
"miss_count" : 6
},
"recovery" : {
"current_as_source" : 0,
"current_as_target" : 0,
"throttle_time_in_millis" : 0
}
},
"os" : {
"timestamp" : 1534975292055,
"cpu" : {
"percent" : 0
},
"mem" : {
"total_in_bytes" : 17179398144,
"free_in_bytes" : 1669337088,
"used_in_bytes" : 15510061056,
"free_percent" : 10,
"used_percent" : 90
},
"swap" : {
"total_in_bytes" : 17179398144,
"free_in_bytes" : 1941716992,
"used_in_bytes" : 15237681152
}
},
"process" : {
"timestamp" : 1534975292055,
"open_file_descriptors" : -1,
"max_file_descriptors" : -1,
"cpu" : {
"percent" : 0,
"total_in_millis" : 38828
},
"mem" : {
"total_virtual_in_bytes" : 13396226048
}
},
"jvm" : {
"timestamp" : 1534975292055,
"uptime_in_millis" : 601374,
"mem" : {
"heap_used_in_bytes" : 226267408,
"heap_used_percent" : 1,
"heap_committed_in_bytes" : 12850036736,
"heap_max_in_bytes" : 12850036736,
"non_heap_used_in_bytes" : 77749000,
"non_heap_committed_in_bytes" : 83845120,
"pools" : {
"young" : {
"used_in_bytes" : 62786472,
"max_in_bytes" : 279183360,
"peak_used_in_bytes" : 279183360,
"peak_max_in_bytes" : 279183360
},
"survivor" : {
"used_in_bytes" : 34865152,
"max_in_bytes" : 34865152,
"peak_used_in_bytes" : 34865152,
"peak_max_in_bytes" : 34865152
},
"old" : {
"used_in_bytes" : 128615784,
"max_in_bytes" : 12535988224,
"peak_used_in_bytes" : 128615784,
"peak_max_in_bytes" : 12535988224
}
}
},
"threads" : {
"count" : 43,
"peak_count" : 43
},
"gc" : {
"collectors" : {
"young" : {
"collection_count" : 5,
"collection_time_in_millis" : 361
},
"old" : {
"collection_count" : 1,
"collection_time_in_millis" : 36
}
}
},
"buffer_pools" : {
"direct" : {
"count" : 27,
"used_in_bytes" : 50704899,
"total_capacity_in_bytes" : 50704898
},
"mapped" : {
"count" : 593,
"used_in_bytes" : 8248466860,
"total_capacity_in_bytes" : 8248466860
}
},
"classes" : {
"current_loaded_count" : 10882,
"total_loaded_count" : 10882,
"total_unloaded_count" : 0
}
},

"thread_pool" : {
"bulk" : {
"threads" : 0,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 0,
"completed" : 0
},
"fetch_shard_started" : {
"threads" : 1,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 8,
"completed" : 18
},
"fetch_shard_store" : {
"threads" : 0,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 0,
"completed" : 0
},
"flush" : {
"threads" : 2,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 2,
"completed" : 36
},
"force_merge" : {
"threads" : 0,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 0,
"completed" : 0
},
"generic" : {
"threads" : 4,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 4,
"completed" : 197
},
"get" : {
"threads" : 0,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 0,
"completed" : 0
},
"index" : {
"threads" : 0,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 0,
"completed" : 0
},
"listener" : {
"threads" : 0,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 0,
"completed" : 0
},
"management" : {
"threads" : 3,
"queue" : 0,
"active" : 1,
"rejected" : 0,
"largest" : 3,
"completed" : 79
},
"refresh" : {
"threads" : 2,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 2,
"completed" : 3454
},
"search" : {
"threads" : 7,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 7,
"completed" : 24
},
"snapshot" : {
"threads" : 0,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 0,
"completed" : 0
},
"warmer" : {
"threads" : 0,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 0,
"completed" : 0
}
},

"fs" : {
"timestamp" : 1534975292055,
"total" : {
"total_in_bytes" : 214221975552,
"free_in_bytes" : 181595267072,
"available_in_bytes" : 181595267072
},
"least_usage_estimate" : {
"path" : "C:\ProgramData\Elastic\Elasticsearch\data\nodes\0",
"total_in_bytes" : 214221975552,
"available_in_bytes" : 181595267072,
"used_disk_percent" : 15.230327512351892
},
"most_usage_estimate" : {
"path" : "C:\ProgramData\Elastic\Elasticsearch\data\nodes\0",
"total_in_bytes" : 214221975552,
"available_in_bytes" : 181595267072,
"used_disk_percent" : 15.230327512351892
},
"data" : [
{
"path" : "C:\ProgramData\Elastic\Elasticsearch\data\nodes\0",
"mount" : "(C:)",
"type" : "NTFS",
"total_in_bytes" : 214221975552,
"free_in_bytes" : 181595267072,
"available_in_bytes" : 181595267072
}
]
},
"transport" : {
"server_open" : 0,
"rx_count" : 0,
"rx_size_in_bytes" : 0,
"tx_count" : 0,
"tx_size_in_bytes" : 0
},
"http" : {
"current_open" : 3,
"total_opened" : 6
},
"breakers" : {
"request" : {
"limit_size_in_bytes" : 7710022041,
"limit_size" : "7.1gb",
"estimated_size_in_bytes" : 0,
"estimated_size" : "0b",
"overhead" : 1.0,
"tripped" : 0
},
"fielddata" : {
"limit_size_in_bytes" : 7710022041,
"limit_size" : "7.1gb",
"estimated_size_in_bytes" : 0,
"estimated_size" : "0b",
"overhead" : 1.03,
"tripped" : 0
},
"in_flight_requests" : {
"limit_size_in_bytes" : 12850036736,
"limit_size" : "11.9gb",
"estimated_size_in_bytes" : 0,
"estimated_size" : "0b",
"overhead" : 1.0,
"tripped" : 0
},
"accounting" : {
"limit_size_in_bytes" : 12850036736,
"limit_size" : "11.9gb",
"estimated_size_in_bytes" : 47067957,
"estimated_size" : "44.8mb",
"overhead" : 1.0,
"tripped" : 0
},
"parent" : {
"limit_size_in_bytes" : 8995025715,
"limit_size" : "8.3gb",
"estimated_size_in_bytes" : 47067957,
"estimated_size" : "44.8mb",
"overhead" : 1.0,
"tripped" : 0
}
},
"script" : {
"compilations" : 0,
"cache_evictions" : 0
},
"discovery" : {
"cluster_state_queue" : {
"total" : 0,
"pending" : 0,
"committed" : 0
},
"published_cluster_states" : {
"full_states" : 0,
"incompatible_diffs" : 0,
"compatible_diffs" : 0
}
},
"ingest" : {
"total" : {
"count" : 0,
"time_in_millis" : 0,
"current" : 0,
"failed" : 0
},
"pipelines" : { }
},
"adaptive_selection" : {
"iU7a_H4_TMG6BFR6rITy4w" : {
"outgoing_searches" : 0,
"avg_queue_size" : 0,
"avg_service_time_ns" : 2698838,
"avg_response_time_ns" : 3729506,
"rank" : "3.7"
}
}
}
}
}

I found out what was causing this: Kibana was accessing this node. When Kibana stopped connecting to the node, the issues stopped.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.