Thread Pools/Hot Thread Help

I am trying to get additional information on thread pools/hot threads.

I am looking for documentation that gives more detailed information on what these terms/stats mean in the output.

If I run the hot threads command, I can see the node.
Its node roles are listed as {dilm}.
There is then a portion: ml.machine_memory=<insert very large number here>, ml.max_open_jobs=20

  1. We do not have any machine learning jobs running. Does Elastic still keep memory open on the nodes for ML even if we do not have any ML jobs active?
  2. Do we have to remove the ML node role in order to remove that ml.machine_memory from the node? Will removing this memory increase performance?

Also, the hot threads API gives information about each specific thread and how much CPU it is using.
Ex: 91.9% (459.7 ms out of 500ms) cpu used by thread [T#1]
4/10 snapshots sharing following 41 elements.

  1. What are considered "snapshots" here?
  2. The 41 elements that follow this section are a little difficult to read. Is there any documentation that provides some insight into how to read these "elements"?

Thanks so much

Specific to your ML questions:

  1. No, if there are no ML jobs, there is no "extra" memory reserved for ML. ML only consumes memory when there are active ML jobs running - and it does so outside of the JVM heap anyway.
  2. Because of 1. above, you don't have to remove the ML node role and even if you did it wouldn't change anything.

I will move this question to the elasticsearch tag so that you can get help there.

Hot threads samples thread activity at several points in time to measure what is running and what is consuming resources at those moments. A snapshot is one of those samples.
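
As a rough illustration, the sampling is controlled by request parameters on the hot threads API: `snapshots` is the number of stack trace samples (10 by default, hence the "4/10") and `interval` is the sampling window (500ms by default, hence "out of 500ms"). As far as I understand, each request takes its own set of samples, so nothing rolls over between calls. The sketch below only builds the request URL; the host `http://localhost:9200` and the helper name `hot_threads_url` are assumptions for illustration.

```python
from urllib.parse import urlencode


def hot_threads_url(base="http://localhost:9200", threads=3,
                    interval="500ms", snapshots=10, sample_type="cpu"):
    """Build a hot threads request URL.

    `base` is an assumed placeholder host; the defaults mirror the
    documented API defaults (3 threads, 500ms interval, 10 snapshots, cpu).
    """
    params = {
        "threads": threads,      # how many of the busiest threads to report
        "interval": interval,    # sampling window used to measure CPU time
        "snapshots": snapshots,  # number of stack trace samples to take
        "type": sample_type,     # cpu, wait, or block
    }
    return f"{base}/_nodes/hot_threads?{urlencode(params)}"


# Example: a longer window with more samples than the defaults.
print(hot_threads_url(interval="1s", snapshots=20))
```

Sending a GET request to the resulting URL returns the plain-text report discussed in this thread.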

There's not, no. If you have specific questions about that output then I would create a new thread :slight_smile:
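
In the meantime, the two summary lines quoted in the question can be picked apart mechanically. This is a minimal sketch, not an official parser: the regular expressions are my own assumptions based only on the sample lines above (times reported in ms, wording "used by" or "usage by"), and the names `parse_header` and `parse_snapshots` are hypothetical.

```python
import re

# Patterns inferred from the sample lines in this thread (assumption:
# both times are reported in milliseconds).
HEADER = re.compile(
    r"(?P<pct>[\d.]+)% \((?P<busy>[\d.]+)\s*ms out of (?P<window>[\d.]+)\s*ms\) "
    r"(?P<kind>\w+) (?:used|usage) by thread (?P<thread>.+)"
)
SNAPSHOTS = re.compile(
    r"(?P<shared>\d+)/(?P<total>\d+) snapshots sharing following (?P<frames>\d+) elements"
)


def parse_header(line):
    """Return busy time, window, and thread name from the usage line."""
    m = HEADER.search(line)
    if m is None:
        return None
    return {
        "percent": float(m["pct"]),
        "busy_ms": float(m["busy"]),
        "window_ms": float(m["window"]),
        "kind": m["kind"],
        "thread": m["thread"].strip(),
    }


def parse_snapshots(line):
    """Return how many samples shared the stack frames that follow."""
    m = SNAPSHOTS.search(line)
    if m is None:
        return None
    return {key: int(value) for key, value in m.groupdict().items()}


header = parse_header("91.9% (459.7 ms out of 500ms) cpu used by thread [T#1]")
shared = parse_snapshots("4/10 snapshots sharing following 41 elements.")
print(header)
print(shared)
```

Read this way, the thread was busy for 459.7 ms of a 500 ms window, and in 4 of the 10 samples its stack trace had the same 41 frames (the "elements") in common.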

One quick follow up:

For the snapshot: how frequent are those intervals when Elastic will take a snapshot? And what happens when the snapshot reaches 10/10? Does it just recycle back to 0?

I'll start a new post regarding the elements section.

Thanks.

One quick follow up:

If there is no "extra" memory reserved for ML, can you elaborate on what that number (ml.machine_memory) is and why it is listed there?

It is literally the memory of the host itself, and it has nothing to do with ML as such. It is reported among the internal settings because knowing the overall size of the host is often useful when debugging ML-related memory issues.
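
To sanity-check that, the value can be converted from bytes to GiB and compared with the RAM installed on the host. A minimal sketch, assuming the attributes appear as a comma-separated `key=value` list as in the question; the number used here is a made-up example (exactly 64 GiB) and `parse_node_attributes` is a hypothetical helper name.

```python
def parse_node_attributes(text):
    """Split 'key=value, key=value' into a dict of strings."""
    attrs = {}
    for part in text.split(","):
        key, _, value = part.strip().partition("=")
        attrs[key] = value
    return attrs


# Hypothetical attribute string; the real value comes from your own node.
sample = "ml.machine_memory=68719476736, ml.max_open_jobs=20"
attrs = parse_node_attributes(sample)

# ml.machine_memory is reported in bytes; 1 GiB = 1024**3 bytes.
gib = int(attrs["ml.machine_memory"]) / 1024**3
print(f"host memory: {gib:.1f} GiB")  # prints "host memory: 64.0 GiB"
```

If the result matches the physical memory of the machine, the attribute is only describing the host and is not a reservation.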