Fatal error in network and heap


(Amin) #1

Hello
I have the following config:
2 client nodes
3 master nodes
3 data+ingest nodes
My cluster is running with only 1 data node; the other 2 data nodes fail with the following error:

 [2018-02-03T18:22:54,183][WARN ][o.e.m.j.JvmGcMonitorService] [es-data-02] [gc][1984] overhead, spent [1.4m] collecting in the last [1.4m]
    [2018-02-03T18:33:50,485][INFO ][o.e.m.j.JvmGcMonitorService] [es-data-02] [gc][old][1985][422] duration [10.9m], collections [76]/[10.9m], total [10.9m]/[52.4m], memory [3.8gb]->[3.8gb]/[3.8gb], all_pools {[young] [865.3mb]->[865.3mb]/[865.3mb]}{[survivor] [108mb]->[107.8mb]/[108.1mb]}{[old] [2.9gb]->[2.9gb]/[2.9gb]}
    [2018-02-03T18:33:50,486][WARN ][o.e.m.j.JvmGcMonitorService] [es-data-02] [gc][1985] overhead, spent [10.9m] collecting in the last [10.9m]
    [2018-02-03T18:34:38,367][ERROR][o.e.t.n.Netty4Utils      ] fatal error on the network layer
            at org.elasticsearch.transport.netty4.Netty4Utils.maybeDie(Netty4Utils.java:185)
            at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.exceptionCaught(Netty4MessageChannelHandler.java:83)
            at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:285)
            at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:264)
            at io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:256)
            at 
            at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.handleReadException(AbstractNioByteChannel.java:104)
            at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:145)
            at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:644)
            at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:544)
            at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:498)
            at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:458)
            at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
            at java.lang.Thread.run(Thread.java:745)
    [2018-02-03T18:42:31,063][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [es-data-02] fatal error in thread [elasticsearch[es-data-02][search][T#14]], exiting
    java.lang.OutOfMemoryError: Java heap space
    [2018-02-03T18:42:31,061][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [es-data-02] fatal error in thread [elasticsearch[es-data-02][generic][T#6]], exiting
    java.lang.OutOfMemoryError: Java heap space

    [2018-02-03T18:33:59,924][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [es-data-02] fatal error in thread [elasticsearch[es-data-02][management][T#3]], exiting
    java.lang.OutOfMemoryError: Java heap space
            at java.lang.StringCoding$StringEncoder.encode(StringCoding.java:300) ~[?:1.8.0_111]
            at java.lang.StringCoding.encode(StringCoding.java:344) ~[?:1.8.0_111]
            at java.lang.String.getBytes(String.java:918) ~[?:1.8.0_111]
            at java.io.UnixFileSystem.canonicalize0(Native Method) ~[?:1.8.0_111]
            at java.io.UnixFileSystem.canonicalize(UnixFileSystem.java:172) ~[?:1.8.0_111]
            at java.io.File.getCanonicalPath(File.java:618) ~[?:1.8.0_111]
            at java.io.FilePermission$1.run(FilePermission.java:215) ~[?:1.8.0_111]
            at java.io.FilePermission$1.run(FilePermission.java:203) ~[?:1.8.0_111]
            at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_111]
            at java.io.FilePermission.init(FilePermission.java:203) ~[?:1.8.0_111]
            at java.io.FilePermission.<init>(FilePermission.java:277) ~[?:1.8.0_111]
            at java.lang.SecurityManager.checkRead(SecurityManager.java:888) ~[?:1.8.0_111]
            at sun.nio.fs.UnixPath.checkRead(UnixPath.java:795) ~[?:?]
            at sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(UnixFileAttributeViews.java:49) ~[?:?]
            at sun.nio.fs.UnixFileSystemProvider.readAttributes(UnixFileSystemProvider.java:144) ~[?:?]
            at sun.nio.fs.LinuxFileSystemProvider.readAttributes(LinuxFileSystemProvider.java:99) ~[?:?]
            at java.nio.file.Files.readAttributes(Files.java:1737) ~[?:1.8.0_111]
            at java.nio.file.Files.size(Files.java:2332) ~[?:1.8.0_111]
            at org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:243) ~[lucene-core-6.6.1.jar:6.6.1 9aa465a89b64ff2dabe7b4d50c472de32c298683 - varunthacker - 2017-08-29 21:54:39]
            at org.apache.lucene.store.FilterDirectory.fileLength(FilterDirectory.java:67) ~[lucene-core-6.6.1.jar:6.6.1 9aa465a89b64ff2dabe7b4d50c472de32c298683 - varunthacker - 2017-08-29 21:54:39]
            at org.apache.lucene.store.FilterDirectory.fileLength(FilterDirectory.java:67) ~[lucene-core-6.6.1.jar:6.6.1 9aa465a89b64ff2dabe7b4d50c472de32c298683 - varunthacker - 2017-08-29 21:54:39]
            at org.elasticsearch.index.store.Store$StoreStatsCache.estimateSize(Store.java:1402) ~[elasticsearch-5.6.3.jar:5.6.3]
            at org.elasticsearch.index.store.Store$StoreStatsCache.refresh(Store.java:1391) ~[elasticsearch-5.6.3.jar:5.6.3]
            at org.elasticsearch.index.store.Store$StoreStatsCache.refresh(Store.java:1378) ~[elasticsearch-5.6.3.jar:5.6.3]
            at org.elasticsearch.common.util.SingleObjectCache.getOrRefresh(SingleObjectCache.java:54) ~[elasticsearch-5.6.3.jar:5.6.3]
            at org.elasticsearch.index.store.Store.stats(Store.java:332) ~[elasticsearch-5.6.3.jar:5.6.3]
            at org.elasticsearch.index.shard.IndexShard.storeStats(IndexShard.java:703) ~[elasticsearch-5.6.3.jar:5.6.3]
            at org.elasticsearch.action.admin.indices.stats.CommonStats.<init>(CommonStats.java:177) ~[elasticsearch-5.6.3.jar:5.6.3]
            at org.elasticsearch.action.admin.indices.stats.TransportIndicesStatsAction.shardOperation(TransportIndicesStatsAction.java:163) ~[elasticsearch-5.6.3.jar:5.6.3]
            at org.elasticsearch.action.admin.indices.stats.TransportIndicesStatsAction.shardOperation(TransportIndicesStatsAction.java:47) ~[elasticsearch-5.6.3.jar:5.6.3]
            at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.onShardOperation(TransportBroadcastByNodeAction.java:433) ~[elasticsearch-5.6.3.jar:5.6.3]
            at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.messageReceived(TransportBroadcastByNodeAction.java:412) ~[elasticsearch-5.6.3.jar:5.6.3]

Would you please help me solve this issue?
Best


(Christian Dahlqvist) #2

It looks like you are suffering from insufficient heap space. What is the full output of the cluster stats API?
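For reference, the cluster stats can be fetched like this (a sketch assuming a reachable node on the default HTTP port; `ES_HOST` is a placeholder to adjust, and with X-Pack security enabled you would also need `-u user:pass`):

```shell
# Fetch cluster-wide stats; "?human&pretty" renders sizes readably.
# ES_HOST is an assumption -- point it at any node's HTTP address.
ES_HOST="${ES_HOST:-http://localhost:9200}"
curl -s "$ES_HOST/_cluster/stats?human&pretty"
```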


(Amin) #3

Hello
{"_nodes":{"total":7,"successful":7,"failed":0},"cluster_name":"logserver","timestamp":1517731166673,"status":"yellow","indices":{"count":506,"shards":{"total":2090,"primaries":2090,"replication":0.0,"index":{"shards":{"min":1,"max":5,"avg":4.130434782608695},"primaries":{"min":1,"max":5,"avg":4.130434782608695},"replication":{"min":0.0,"max":0.0,"avg":0.0}}},"docs":{"count":510863423,"deleted":468685},"store":{"size":"310.3gb","size_in_bytes":333253899037,"throttle_time":"0s","throttle_time_in_millis":0},"fielddata":{"memory_size":"4.2kb","memory_size_in_bytes":4392,"evictions":0},"query_cache":{"memory_size":"8.5mb","memory_size_in_bytes":8934789,"total_count":458244,"hit_count":306880,"miss_count":151364,"cache_size":1078,"cache_count":2472,"evictions":1394},"completion":{"size":"0b","size_in_bytes":0},"segments":{"count":21355,"memory":"1.6gb","memory_in_bytes":1753280287,"terms_memory":"1.2gb","terms_memory_in_bytes":1356601416,"stored_fields_memory":"103.5mb","stored_fields_memory_in_bytes":108602544,"term_vectors_memory":"0b","term_vectors_memory_in_bytes":0,"norms_memory":"460.9kb","norms_memory_in_bytes":472000,"points_memory":"12.6mb","points_memory_in_bytes":13311835,"doc_values_memory":"261.5mb","doc_values_memory_in_bytes":274292492,"index_writer_memory":"43.1mb","index_writer_memory_in_bytes":45243776,"version_map_memory":"9.9mb","version_map_memory_in_bytes":10395017,"fixed_bit_set":"86.5kb","fixed_bit_set_memory_in_bytes":88584,"max_unsafe_auto_id_timestamp":1517413012754,"file_sizes":{}}},"nodes":{"count":{"total":7,"data":1,"coordinating_only":3,"master":3,"ingest":1},"versions":["5.6.3"],"os":{"available_processors":40,"allocated_processors":40,"names":[{"name":"Linux","count":7}],"mem":{"total":"57.3gb","total_in_bytes":61590376448,"free":"1.2gb","free_in_bytes":1341280256,"used":"56.1gb","used_in_bytes":60249096192,"free_percent":2,"used_percent":98}},"process":{"cpu":{"percent":12},"open_file_descriptors":{"min":334,"max":5136,"avg":1021}},"jvm
":{"max_uptime":"58.8d","max_uptime_in_millis":5081490657,"versions":[{"version":"1.8.0_111","vm_name":"OpenJDK 64-Bit Server VM","vm_version":"25.111-b15","vm_vendor":"Oracle Corporation","count":7}],"mem":{"heap_used":"11.8gb","heap_used_in_bytes":12722490824,"heap_max":"31.6gb","heap_max_in_bytes":34037170176},"threads":697},"fs":{"total":"2.1tb","total_in_bytes":2403531415552,"free":"1.8tb","free_in_bytes":2009820860416,"available":"1.7tb","available_in_bytes":1902446678016,"spins":"true"},"plugins":[{"name":"x-pack","version":"5.6.3","description":"Elasticsearch Expanded Pack Plugin","classname":"org.elasticsearch.xpack.XPackPlugin","has_native_controller":true}],"network_types":{"transport_types":{"security4":7},"http_types":{"security4":7}}}}


(Christian Dahlqvist) #4

Given the number of data nodes you have and the amount of heap assigned to these, you seem to have too many shards. Read this blog post for guidance on how many shards you should have in your cluster.
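To see where those shards are, the `_cat` APIs are a quick way to spot the many small indices (a sketch; `ES_HOST` is a placeholder for any node's HTTP address):

```shell
# Indices sorted by primary store size; with ~2090 primaries across 506
# indices, expect a long tail of small ones at the bottom of this list.
ES_HOST="${ES_HOST:-http://localhost:9200}"
URL="$ES_HOST/_cat/indices?v&h=index,pri,rep,pri.store.size&s=pri.store.size:desc"
curl -s "$URL"

# Shard and disk distribution per data node:
curl -s "$ES_HOST/_cat/allocation?v"
```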


(Amin) #5

172.24.69.21 7 98 0 0.00 0.02 0.05 - - es-client-03
172.24.69.14 12 98 0 0.00 0.02 0.05 - - es-client-02
172.24.69.16 81 97 33 5.48 5.30 5.24 di - es-data-01
172.24.69.13 12 98 0 0.00 0.04 0.05 - - es-client-01
172.24.69.20 58 98 2 0.01 0.05 0.05 m * es-master-02
172.24.69.19 16 98 0 0.00 0.01 0.05 m - es-master-01
172.24.69.22 14 98 0 0.00 0.01 0.05 m - es-master-03
I have 3 data nodes, but 2 of them failed because of the fatal error.
This is my JVM config:

## JVM configuration

################################################################
## IMPORTANT: JVM heap size
################################################################
##
## You should always set the min and max JVM heap
## size to the same value. For example, to set
## the heap to 4 GB, set:
##
## -Xms4g
## -Xmx4g
##
## See https://www.elastic.co/guide/en/elasticsearch/reference/current/heap-size.html
## for more information
##
################################################################

# Xms represents the initial size of total heap space
# Xmx represents the maximum size of total heap space

-Xms8g
-Xmx8g

################################################################
## Expert settings
################################################################
##
## All settings below this section are considered
## expert settings. Don't tamper with them unless
## you understand what you are doing
##
################################################################

## GC configuration
-XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly

## optimizations

# disable calls to System#gc
-XX:+DisableExplicitGC

# pre-touch memory pages used by the JVM during initialization
-XX:+AlwaysPreTouch

## basic

# force the server VM (remove on 32-bit client JVMs)
-server

# explicitly set the stack size (reduce to 320k on 32-bit client JVMs)
-Xss1m

# set to headless, just in case
-Djava.awt.headless=true

# ensure UTF-8 encoding by default (e.g. filenames)
-Dfile.encoding=UTF-8

# use our provided JNA always versus the system one
-Djna.nosys=true

# use old-style file permissions on JDK9
-Djdk.io.permissionsUseCanonicalPath=true

# flags to configure Netty
-Dio.netty.noUnsafe=true
-Dio.netty.noKeySetOptimization=true
-Dio.netty.recycler.maxCapacityPerThread=0

# log4j 2
-Dlog4j.shutdownHookEnabled=false
-Dlog4j2.disable.jmx=true
-Dlog4j.skipJansi=true

## heap dumps

# generate a heap dump when an allocation from the Java heap fails
# heap dumps are created in the working directory of the JVM
-XX:+HeapDumpOnOutOfMemoryError

# specify an alternative path for heap dumps
# ensure the directory exists and has sufficient space
#-XX:HeapDumpPath=${heap.dump.path}

## GC logging

#-XX:+PrintGCDetails
#-XX:+PrintGCTimeStamps
#-XX:+PrintGCDateStamps
#-XX:+PrintClassHistogram
#-XX:+PrintTenuringDistribution
#-XX:+PrintGCApplicationStoppedTime

# log GC status to a file with time stamps
# ensure the directory exists
#-Xloggc:${loggc}

# By default, the GC log file will not rotate.
# By uncommenting the lines below, the GC log file
# will be rotated every 128MB at most 32 times.
#-XX:+UseGCLogFileRotation
#-XX:NumberOfGCLogFiles=32
#-XX:GCLogFileSize=128M

# Elasticsearch 5.0.0 will throw an exception on unquoted field names in JSON.
# If documents were already indexed with unquoted fields in a previous version
# of Elasticsearch, some operations may throw errors.
#
# WARNING: This option will be removed in Elasticsearch 6.0.0 and is provided
# only for migration purposes.
#-Delasticsearch.json.allow_unquoted_field_names=true

(Christian Dahlqvist) #6

I believe you need to either add more heap or reduce the amount of heap used, e.g. by reducing the shard count. As your average shard size looks quite small, I would first look at reducing the number of shards.
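One way to reduce the shard count for existing indices is the `_shrink` API (available since Elasticsearch 5.0). A sketch, where the index name `logstash-2018.01.01` and the node name are hypothetical and the target shard count must be a factor of the source's:

```shell
ES_HOST="${ES_HOST:-http://localhost:9200}"
IDX="logstash-2018.01.01"   # hypothetical example index

# 1. Force a complete copy of every shard onto one node and block writes;
#    shrink requires both before it will run.
curl -s -X PUT "$ES_HOST/$IDX/_settings" -H 'Content-Type: application/json' -d '{
  "settings": {
    "index.routing.allocation.require._name": "es-data-01",
    "index.blocks.write": true
  }
}'

# 2. Once relocation finishes, shrink into a new single-shard index.
curl -s -X POST "$ES_HOST/$IDX/_shrink/$IDX-shrunk" -H 'Content-Type: application/json' -d '{
  "settings": { "index.number_of_shards": 1, "index.number_of_replicas": 0 }
}'
```

For new daily indices, lowering `index.number_of_shards` in the index template (the default of 5 matches the per-index average in your stats) avoids the problem going forward.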


(system) #7

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.