Heap Size setting in k8s

Hi All,

I am trying to set up Elasticsearch (version 6.8.14) on k8s using the ECK operator. As per the documentation, the heap size should be 50% of RAM.

  1. So if I set requests and limits to 16GB, then ES_JAVA_OPTS should specify 8GB, i.e. Xms and Xmx should both be 8GB. Is my understanding correct?

  2. What happens if I allocate a larger heap (for example, 14GB of heap out of 16GB total RAM)? Does this configuration slow down querying or data ingestion?
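
For context, the setup I am describing corresponds to an ECK manifest roughly like the sketch below. The cluster name, node set name, and node count are placeholders, and the exact podTemplate layout should be double-checked against the ECK documentation for your operator version:

```yaml
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: quickstart            # placeholder cluster name
spec:
  version: 6.8.14
  nodeSets:
  - name: data                # placeholder node set name
    count: 3
    podTemplate:
      spec:
        containers:
        - name: elasticsearch
          resources:
            requests:
              memory: 16Gi
            limits:
              memory: 16Gi
          env:
          - name: ES_JAVA_OPTS
            value: "-Xms8g -Xmx8g"   # ~50% of the 16Gi container limit
```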

Hi,

  1. Yes
  2. Both the JVM and Elasticsearch require native memory to store data outside of the heap. The JVM process can be killed by the OOM Killer if a memory allocation in the non-heap space exceeds the memory limit.

Hi Micheal,

Thanks for the detailed info.

In my case, the Elasticsearch data node pods keep restarting; from the logs I found the error below.

Error

org.elasticsearch.transport.ConnectTransportException: [node-3][10.244.8.2:9300] connect_exception
at org.elasticsearch.transport.TcpTransport$ChannelsConnectedListener.onFailure(TcpTransport.java:1309) ~[elasticsearch-6.8.14.jar:6.8.14]
at org.elasticsearch.action.ActionListener.lambda$toBiConsumer$2(ActionListener.java:101) ~[elasticsearch-6.8.14.jar:6.8.14]
at org.elasticsearch.common.concurrent.CompletableContext.lambda$addListener$0(CompletableContext.java:42) ~[elasticsearch-core-6.8.14.jar:6.8.14]
at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859) ~[?:?]
at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837) ~[?:?]
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) ~[?:?]
at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2152) ~[?:?]
at org.elasticsearch.common.concurrent.CompletableContext.completeExceptionally(CompletableContext.java:57) ~[elasticsearch-core-6.8.14.jar:6.8.14]
at org.elasticsearch.transport.netty4.Netty4TcpChannel.lambda$new$1(Netty4TcpChannel.java:72) ~[?:?]
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:511) ~[?:?]
at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:504) ~[?:?]
at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:483) ~[?:?]
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:424) ~[?:?]
at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:121) ~[?:?]
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:327) ~[?:?]
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:343) ~[?:?]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:644) ~[?:?]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:556) ~[?:?]
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:510) ~[?:?]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:470) ~[?:?]
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:909) ~[?:?]
at java.lang.Thread.run(Thread.java:832) [?:?]
Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: 10.244.8.2/10.244.8.2:9300
at sun.nio.ch.Net.pollConnect(Native Method) ~[?:?]
at sun.nio.ch.Net.pollConnectNow(Net.java:660) ~[?:?]
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:875) ~[?:?]
at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:327) ~[?:?]
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:340) ~[?:?]
... 6 more
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.Net.pollConnect(Native Method) ~[?:?]
at sun.nio.ch.Net.pollConnectNow(Net.java:660) ~[?:?]
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:875) ~[?:?]
at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:327) ~[?:?]
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:340) ~[?:?]
... 6 more


Not sure what is causing this. I ran the API call below to check the CPU and RAM usage:

https://es:9200/_cat/nodes?v&h=ip,heap.current,heap.percent,heap.max,ram.current,ram.percent,ram.max

Output below. It looks like the RAM has been completely utilized:
Total RAM: 16GB
Heap size: 14GB
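
For reference, the call above can be run with curl roughly as follows. The hostname, credentials, and the -k flag (skip TLS verification) are placeholders that depend on your cluster's TLS and auth setup:

```shell
# Query per-node heap and RAM stats from the _cat/nodes API.
# Adjust the URL, user, and TLS options for your cluster.
curl -k -u "elastic:$ELASTIC_PASSWORD" \
  "https://es:9200/_cat/nodes?v&h=ip,heap.current,heap.percent,heap.max,ram.current,ram.percent,ram.max"
```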

Is the larger heap (and correspondingly smaller non-heap) allocation causing the data node pods to restart?

Hard to say without knowing whether the process is killed by the kernel or exits by itself.
Keep in mind that by default the JVM is started with -XX:+AlwaysPreTouch, which means that even if the heap is not fully used, that memory cannot be allocated for any other purpose. In your case it means there are only 2GB of memory left for allocations outside of the heap. I would first try to follow the recommendation (heap at 50% of the container memory limit) and see if that prevents the nodes from being restarted.
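
To tell whether the kernel's OOM killer is involved, you can check the container's last termination state, roughly like this (the pod name is a placeholder):

```shell
# If the container was killed for exceeding its memory limit, Kubernetes
# records the last termination reason as "OOMKilled".
kubectl get pod <es-data-pod-name> \
  -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'
```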

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.