As per the documentation, the Elasticsearch team recommends that every Elasticsearch node should have slightly less than 32GB of memory. My question is: does this apply to a dedicated machine learning node as well? And if we do give more than 32GB of memory to a dedicated machine learning node, what are the repercussions?
The recommendation is that the JVM heap size for Elasticsearch should be less than 32GB, so that the JVM can use compressed ordinary object pointers (compressed oops). A JVM heap of 33GB will use memory more wastefully than a JVM heap of 31GB.
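As a quick self-check (a sketch, not an official recipe; the host, port and log path are placeholders for your own setup), you can confirm that a node's heap is still small enough for compressed oops either via the nodes info API or from the line Elasticsearch logs at startup:

```
# Ask each node whether it is running with compressed oops enabled.
curl -s 'http://localhost:9200/_nodes?filter_path=nodes.*.name,nodes.*.jvm.using_compressed_ordinary_object_pointers&pretty'

# The same information appears once in the log at startup, e.g.
# "heap size [31gb], compressed ordinary object pointers [true]"
grep 'compressed ordinary object pointers' /var/log/elasticsearch/*.log
```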
The JVM heap size is different from the size of the machine/VM/container where the software is running. At large scale it's certainly possible, and desirable, for the machine to be bigger than 32GB.
So this question comes down to what you mean by "node". The Elastic docs are a bit confusing in this regard. Sometimes "node" is used to mean a JVM running Elasticsearch and sometimes "node" is used to mean the whole machine/VM/container that it's running on.
If you mean JVM heap size then no, you shouldn't give an ML node's JVM heap more than 32GB. In fact, ML nodes use most of their memory outside the JVM, so the JVM heap on an ML node should be smaller than on a data node.
But you'll be able to run more native ML processes with more memory, so you can certainly have machines/VMs/containers for ML that are bigger than 32GB.
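For example (a hedged sketch using the cat nodes API; the host is a placeholder), you can put each node's JVM heap next to the RAM of the machine it runs on to see the two numbers the recommendation distinguishes:

```
# heap.max is the JVM heap ceiling; ram.max is the memory of the whole
# machine/VM/container hosting the node. On a dedicated ML node the second
# figure can be far above 32GB even though the first stays below it.
curl -s 'http://localhost:9200/_cat/nodes?v&h=name,node.role,heap.max,ram.max,ram.percent'
```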
Hello,
We followed this configuration and gave less than 31GB to the JVM heap. We have now configured one ML job, and while the job was running we could see some changes in JVM heap utilization in Kibana Stack Monitoring, but we didn't see any change in the total used RAM of our machine.
As per the documentation, the ML job (the ML processes) uses memory outside of the JVM heap. We first observed the RAM used while the job was in the closed state, and again while the job was opened and the datafeed was running; in both cases we could not see any change in the used-RAM figure. Can you explain why this is happening, and why we only see changes in JVM heap utilization and not in total RAM used?
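For reference, the job's own accounting of that native (outside-heap) memory is reported in its model size stats; a minimal sketch, assuming a job id of test_job and a local cluster:

```
# memory_status and model_bytes come from the native autodetect process,
# so they reflect memory used outside the JVM heap.
curl -s 'http://localhost:9200/_ml/anomaly_detectors/test_job/_stats?pretty' \
  | grep -E '"model_bytes"|"memory_status"'
```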
Try using top. While jobs are running you should see autodetect processes in the list of processes that are using CPU, and top will also show the memory they are using.
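A non-interactive variant of the same check (a sketch, assuming a Linux host) is to list the ML native processes and their resident memory directly:

```
# RSS is reported in kilobytes; autodetect is the native process that runs
# an anomaly detection job, so it only appears while a job is open.
ps -eo pid,rss,args | grep autodetect | grep -v grep
```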
Yes, we did that. We examined the processes with that command both when our ML job was open and when it was closed, and in both cases the figure was unchanged. We could see some changes in JVM heap in Kibana Stack Monitoring, but there was no change in the total RAM used. As per my understanding, the ML process should use memory outside of the JVM heap, so we would expect to see some change in the total RAM used.
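One way to make that comparison more precise than the whole-machine "used RAM" figure (again a sketch, assuming a Linux host and an open anomaly detection job) is to sum the resident memory of the autodetect processes with the job closed and then open; a small test job may only hold a few tens of megabytes of model state, which is easy to miss in the total RAM of a large machine:

```
# Prints the combined resident memory (in MB) of all autodetect processes;
# run it once with the job closed and once with it open and compare.
ps -eo rss,comm | awk '/autodetect/ {sum += $1} END {printf "%.1f MB\n", sum/1024}'
```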