Elasticsearch data node out of memory

Hello everyone,
We are having an out-of-memory issue on our Elasticsearch data nodes. Can you please help me find the cause?
Here is the log from the Elasticsearch cluster.

[2023-02-17 06:35:35,197][DEBUG][action.search ] [dn8] All shards failed for phase: [query]
... [{"sort": {"ingdt": {"order": "desc"}}, "query": {"bool ... : {"excludes": , "includes": ["ingdt"]}, "from": 0, "size": 1}]]; nested: SearchParseException[No mapping found for [ingdt] in order to sort on];
at org.elasticsearch.search.SearchService.parseSource(SearchService.java:855)
at org.elasticsearch.search.SearchService.createContext(SearchService.java:654)
at org.elasticsearch.search.SearchService.createAndPutContext(SearchService.java:620)
at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:371)
at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageRe...
at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageRe...
at org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:3...
at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.j...
at org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler...
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: SearchParseException[No mapping found for [ingdt] in order to sort on]
at org.elasticsearch.search.sort.SortParseElement.addSortField(SortParseElement.java:213)
at org.elasticsearch.search.sort.SortParseElement.addCompoundSortField(SortParseElement.java:187)
at org.elasticsearch.search.sort.SortParseElement.parse(SortParseElement.java:95)
at org.elasticsearch.search.SearchService.parseSource(SearchService.java:838)
... 12 more
[2023-02-17 10:00:11,868][INFO ][monitor.jvm ] [dn8] [gc][old][157114][139] duration [6.9s], collections [1]/[7.4s], total [6.9s]/[6.3m], memory [12.4gb]->[9gb]/[31.8gb], all_pools {[young]8.1mb]->[0b]/[108.1mb]}{[old] [12.3gb]->[8.9gb]/[30.9gb]}
[2023-02-17 10:00:28,157][INFO ][monitor.jvm ] [dn8] [gc][old][157121][140] duration [9.2s], collections [1]/[9.9s], total [9.2s]/[6.4m], memory [28.9gb]->[31.1gb]/[31.8gb], all_pools {[young] [187.3mb]->[177mb]/[865.3mb]}{[survivor] [108.1mb]->[0b]/[108.1mb]}{[old] [28.6gb]->[30.9gb]/[30.9gb]}
[2023-02-17 10:00:36,278][INFO ][monitor.jvm ] [dn8] [gc][old][157122][141] duration [8s], collections [1]/[8.1s], total [8s]/[6.6m], memory [31.1gb]->[31.6gb]/[31.8gb], all_pools {[young] [177mb]->[740.2mb]/[865.3mb]}{[survivor] [0b]->[0b]/[108.1mb]}{[old] [30.9gb]->[30.9gb]/[30.9gb]}
[2023-02-17 10:00:44,582][INFO ][monitor.jvm ] [dn8] [gc][old][157123][142] duration [8.2s], collections [1]/[8.3s], total [8.2s]/[6.7m], memory [31.6gb]->[31.8gb]/[31.8gb], all_pools {[young] [740.2mb]->[865.3mb]/[865.3mb]}{[survivor] [0b]->[69.5mb]/[108.1mb]}{[old] [30.9gb]->[30.9gb]/[30.9gb]}
[2023-02-17 10:02:47,551][WARN ][netty.channel.socket.nio.AbstractNioSelector] Unexpected exception in the selector loop.
java.lang.OutOfMemoryError: Java heap space
[2023-02-17 10:04:13,535][WARN ][netty.channel.socket.nio.AbstractNioSelector] Unexpected exception in the selector loop.
java.lang.OutOfMemoryError: Java heap space
[2023-02-17 10:02:54,144][WARN ][netty.channel.socket.nio.AbstractNioSelector] Unexpected exception in the selector loop.
java.lang.OutOfMemoryError: Java heap space
[2023-02-17 10:02:47,551][WARN ][netty.channel.socket.nio.AbstractNioSelector] Unexpected exception in the selector loop.
java.lang.OutOfMemoryError: Java heap space
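
To read the GC lines above at a glance, a small Python sketch like the following can pull the old-generation figures out of each `[gc][old]` entry. The regular expression and the name `old_gen_after_gc` are illustrative assumptions, not part of Elasticsearch, and the sample lines are shortened copies of the entries above (the elided middle is marked with `...`).

```python
import re

# Matches the collection counter, the pause, and the old-gen pool of a
# monitor.jvm "[gc][old]" line, written against the log format shown above.
GC_OLD = re.compile(
    r"\[gc\]\[old\]\[\d+\]\[(?P<count>\d+)\] duration \[(?P<pause>[\d.]+)s\]"
    r".*?\{\[old\] \[(?P<before>[\d.]+)gb\]->\[(?P<after>[\d.]+)gb\]"
    r"/\[(?P<max>[\d.]+)gb\]\}"
)

# Shortened copies of the four GC entries; "..." replaces the middle of each line.
SAMPLE = [
    "[2023-02-17 10:00:11,868][INFO ][monitor.jvm ] [dn8] [gc][old][157114][139] duration [6.9s], ... {[old] [12.3gb]->[8.9gb]/[30.9gb]}",
    "[2023-02-17 10:00:28,157][INFO ][monitor.jvm ] [dn8] [gc][old][157121][140] duration [9.2s], ... {[old] [28.6gb]->[30.9gb]/[30.9gb]}",
    "[2023-02-17 10:00:36,278][INFO ][monitor.jvm ] [dn8] [gc][old][157122][141] duration [8s], ... {[old] [30.9gb]->[30.9gb]/[30.9gb]}",
    "[2023-02-17 10:00:44,582][INFO ][monitor.jvm ] [dn8] [gc][old][157123][142] duration [8.2s], ... {[old] [30.9gb]->[30.9gb]/[30.9gb]}",
]


def old_gen_after_gc(lines):
    """Return (collection number, pause in seconds, old-gen fill ratio after GC)."""
    rows = []
    for line in lines:
        m = GC_OLD.search(line)
        if m is None:
            continue  # not an old-gen GC line
        fill = float(m["after"]) / float(m["max"])
        rows.append((int(m["count"]), float(m["pause"]), fill))
    return rows


rows = old_gen_after_gc(SAMPLE)
for count, pause, fill in rows:
    print(f"old GC #{count}: paused {pause:.1f}s, old gen {fill:.0%} full afterwards")
```

On these four entries, old gen is about 29% full after collection 139 and completely full after collections 140 to 142, so the later collections reclaim nothing. Old gen also goes from 8.9gb (after the 10:00:11 collection) to 28.6gb (before the 10:00:28 one), roughly 20 GB in about 16 seconds. That looks more like a few very large allocations than a slow leak, though the logs alone cannot confirm it.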

Which version of Elasticsearch are you using?

What is the full output of the cluster stats API?

What is the size and hardware specification of the cluster?

Cluster and version:
{
"timestamp" : 1676663158235,
"cluster_name" : "psw",
"status" : "green",
"indices" : {
"count" : 103,
"shards" : {
"total" : 1100,
"primaries" : 550,
"replication" : 1.0,
"index" : {
"shards" : {
"min" : 5,
"max" : 80,
"avg" : 10.679611650485437
},
"primaries" : {
"min" : 5,
"max" : 40,
"avg" : 5.339805825242719
},
"replication" : {
"min" : 0.0,
"max" : 2.0,
"avg" : 1.0
}
}
},
"docs" : {
"count" : 796821702,
"deleted" : 55324205
},
"store" : {
"size_in_bytes" : 1798260205580,
"throttle_time_in_millis" : 0
},
"fielddata" : {
"memory_size_in_bytes" : 16909107400,
"evictions" : 0
},
"query_cache" : {
"memory_size_in_bytes" : 1685706352,
"total_count" : 179648382,
"hit_count" : 7978360,
"miss_count" : 171670022,
"cache_size" : 61342,
"cache_count" : 63404,
"evictions" : 2062
},
"completion" : {
"size_in_bytes" : 0
},
"segments" : {
"count" : 7169,
"memory_in_bytes" : 3236880780,
"terms_memory_in_bytes" : 2728959888,
"stored_fields_memory_in_bytes" : 454045704,
"term_vectors_memory_in_bytes" : 0,
"norms_memory_in_bytes" : 36443264,
"doc_values_memory_in_bytes" : 17431924,
"index_writer_memory_in_bytes" : 0,
"index_writer_max_memory_in_bytes" : 67071705088,
"version_map_memory_in_bytes" : 0,
"fixed_bit_set_memory_in_bytes" : 3066088
},
"percolate" : {
"total" : 0,
"time_in_millis" : 0,
"current" : 0,
"memory_size_in_bytes" : -1,
"memory_size" : "-1b",
"queries" : 0
}
},
"nodes" : {
"count" : {
"total" : 9,
"master_only" : 0,
"data_only" : 6,
"master_data" : 3,
"client" : 0
},
"versions" : [ "2.3.5" ],
"os" : {
"available_processors" : 144,
"allocated_processors" : 144,
"mem" : {
"total_in_bytes" : 10750435328
},
"names" : [ {
"name" : "Linux",
"count" : 9
} ]
},
"process" : {
"cpu" : {
"percent" : 0
},
"open_file_descriptors" : {
"min" : 1788,
"max" : 2021,
"avg" : 1891
}
},
"jvm" : {
"max_uptime_in_millis" : 31383265,
"versions" : [ {
"version" : "1.8.0_181",
"vm_name" : "OpenJDK 64-Bit Server VM",
"vm_version" : "25.181-b13",
"vm_vendor" : "Oracle Corporation",
"count" : 7
}, {
"version" : "1.8.0_242",
"vm_name" : "OpenJDK 64-Bit Server VM",
"vm_version" : "25.242-b08",
"vm_vendor" : "Oracle Corporation",
"count" : 2
} ],
"mem" : {
"heap_used_in_bytes" : 90600761640,
"heap_max_in_bytes" : 308217249792
},
"threads" : 1662
},
"fs" : {
"total_in_bytes" : 14266797133824,
"free_in_bytes" : 12238875586560,
"available_in_bytes" : 11637248139264
},
"plugins" : [ {
"name" : "head",
"version" : "master",
"description" : "head - A web front end for an Elasticsearch cluster",
"url" : "/_plugin/head/",
"jvm" : false,
"site" : true
}, {
"name" : "cloud-aws",
"version" : "2.3.5",
"description" : "The Amazon Web Service (AWS) Cloud plugin allows to use AWS API for the unicast discovery mechanism and add S3 repositories.",
"jvm" : true,
"classname" : "org.elasticsearch.plugin.cloud.aws.CloudAwsPlugin",
"isolated" : true,
"site" : false
}, {
"name" : "delete-by-query",
"version" : "2.3.5",
"description" : "The Delete By Query plugin allows to delete documents in Elasticsearch with a single query.",
"jvm" : true,
"classname" : "org.elasticsearch.plugin.deletebyquery.DeleteByQueryPlugin",
"isolated" : true,
"site" : false
}, {
"name" : "sql",
"version" : "2.3.5.0",
"description" : "Query elasticsearch using SQL",
"url" : "/_plugin/sql/",
"jvm" : true,
"classname" : "org.elasticsearch.plugin.nlpcn.SqlPlug",
"isolated" : true,
"site" : true
} ]
}
}

Hardware: 8x64 (8 CPUs and 64 GB RAM)
Assigned heap size for each data node is 32 GB (50%).
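
To put the cluster stats in proportion, here is a minimal Python sketch that compares the memory consumers the stats API reports against the JVM heap figures. The numbers are copied from the output above; the dictionary keys and the helper name `heap_report` are made up for illustration.

```python
# Figures copied from the cluster stats output above (bytes, except counts).
STATS = {
    "fielddata": 16909107400,
    "query_cache": 1685706352,
    "segments": 3236880780,
    "heap_used": 90600761640,
    "heap_max": 308217249792,
    "nodes": 9,
    "shards": 1100,
}

GIB = 2 ** 30


def heap_report(stats):
    """Map each tracked consumer to (size in GiB, share of used heap)."""
    report = {}
    for name in ("fielddata", "segments", "query_cache"):
        report[name] = (stats[name] / GIB, stats[name] / stats["heap_used"])
    return report


report = heap_report(STATS)
for name, (size_gib, share) in report.items():
    print(f"{name:12s} {size_gib:6.2f} GiB  {share:5.1%} of used heap")

heap_fill = STATS["heap_used"] / STATS["heap_max"]
shards_per_node = STATS["shards"] / STATS["nodes"]
print(f"heap used    {heap_fill:.1%} of max")
print(f"shards/node  {shards_per_node:.1f}")
```

Fielddata comes out at roughly 15.7 GiB, close to a fifth of the used heap, and the stats show zero fielddata evictions, which makes it the largest of the reported consumers. This is a cluster-wide snapshot, so it does not show how much of that sat on dn8 at the moment it ran out of memory.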

That is a very, very old version that has been EOL for a long time. I have not used that version in many years, so I do not think I will be able to help much. I would recommend upgrading.

It also seems like you are using a third-party plugin that I have never used and that may very well contribute to heap usage.

Ok, thank you for your help.

I suspect that they are running a search query that sorts on the field ingdt across the whole index, and that this is causing the out-of-memory issue. See the log snippet below. Do you agree?

All shards failed for phase: [query]
... [{"sort": {"ingdt": {"order": "desc"}}, "query": {"bool ... : {"excludes": , "includes": ["ingdt"]}, "from": 0, "size": 1}]]; nested: SearchParseException[No mapping found for [ingdt] in order to sort on];
at org.elasticsearch.search.SearchService.parseSource(SearchService.java:855)
at org.elasticsearch.search.SearchService.createContext(SearchService.java:654)
at org.elasticsearch.search.SearchService.createAndPutContext(SearchService.java:620)
at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:371)
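
One way to test this suspicion is to check where `ingdt` is mapped and how much fielddata it holds. The trace shows the exception being thrown from `SearchService.parseSource`, that is, while the request was still being parsed on shards that have no mapping for `ingdt`, so those particular shards failed before any sorting took place. The Python sketch below only builds the requests and sends nothing. The host `http://localhost:9200`, the helper names, and the guess that `ingdt` is a date field are all assumptions. The two endpoints and the `unmapped_type` sort option are taken from the 2.x-era API and should be verified against the documentation for your exact version.

```python
import json

BASE_URL = "http://localhost:9200"  # assumption: replace with a real node address
FIELD = "ingdt"


def diagnostic_urls(base_url, field):
    """URLs for checking the field's mapping and its per-node fielddata size."""
    return {
        # Get-field-mapping API across all indices: shows which indices map the field.
        "mapping": f"{base_url}/_all/_mapping/field/{field}",
        # cat fielddata API: heap held by fielddata for this field on each node.
        "fielddata_per_node": f"{base_url}/_cat/fielddata?v&fields={field}",
    }


def sort_body(field, unmapped_type):
    """Search body sorting on `field` without failing on indices that lack it.

    The "query" part of the original request is truncated in the log,
    so it is left out here rather than guessed.
    """
    return {
        "sort": [{field: {"order": "desc", "unmapped_type": unmapped_type}}],
        "from": 0,
        "size": 1,
    }


urls = diagnostic_urls(BASE_URL, FIELD)
body = sort_body(FIELD, "date")  # assumption: ingdt is a date field

for name, url in urls.items():
    print(f"{name}: GET {url}")
print(json.dumps(body, indent=2))
```

Note that `unmapped_type` only stops shards without the mapping from failing. It does not reduce memory use on the indices where the field is mapped.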

2.x is 7 years old and we are now up to 8.6, so few people are likely to remember how to debug this version. You are unlikely to get much advice other than what Christian mentioned: upgrade ASAP.

Possibly? Again, I don't remember most of what 2.4 did to comment with any assurance.

Also one final comment - please format your code/logs/config using the </> button, or markdown style back ticks. It helps to make things easy to read which helps us help you :slight_smile:
