Can't avoid swapping on Windows cluster

We have a cluster that has three indexes (one main and two supporting) with 5 shards and one replica. I am using 7 Windows nodes, each of which has 8 cores and 56 GB of RAM. I have set the heap to 28 GB. I have disabled the paging file in the OS and set mlock to true. No matter what I have tried, I am still seeing swap being used: ElasticHQ is reporting between 19 and 32 MB of swap used on the nodes.
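
For reference, this is roughly what we have set (paraphrasing; the exact setting name depends on the Elasticsearch version, and the heap is set through the environment/service configuration rather than the yml, so treat this as a sketch rather than our literal files):

# elasticsearch.yml
bootstrap.mlockall: true    # called bootstrap.memory_lock on newer versions

# heap size (environment variable / Windows service configuration)
ES_HEAP_SIZE=28g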

We have found that we have far exceeded the recommended shard size of 30-50 GB; our shards are currently 70-90 GB. We are gearing up to reindex and will split out to more shards at that time. The swap, however, has been an issue since we started (i.e. even when the shards were smaller).

We are seeing some performance issues, and I can't help but be concerned about this metric that ElasticHQ is flagging in red. I would like to resolve it, but haven't had any luck yet.

Thanks,
~john

Ignore the below. I thought you had not tried the options.

This might be useful: Disable Swapping

mlockall works only on Linux/Unix systems. I think disabling the paging file should be enough. Personally, I don't trust the numbers ElasticHQ shows for ES nodes running on Windows.
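
If you want to double-check from the OS side that no page file is actually configured, something like the following should work (wmic output formatting varies a bit between Windows versions):

:: lists any configured page files (no output means none are in use)
wmic pagefile list /format:list

:: shows whether Windows is still set to manage the page file automatically
wmic computersystem get AutomaticManagedPagefile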

The info states that it is reporting: stats.os.swap.used_in_bytes / 1024 / 1024

It states that anything greater than 1 is a warning. It matches what I see when querying in Sense, so I don't think HQ is misrepresenting anything.
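
The same number can be pulled directly with something like this (assuming filter_path is supported on the version in use):

curl -XGET "http://localhost:9200/_nodes/stats/os?filter_path=**.swap"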

bootstrap.memory_lock in Elasticsearch 2.x and 5.x uses VirtualLock on Windows to lock a specified region of the process's virtual address space into physical memory.

What does Resource Monitor report for the process, specifically, commit and working set memory?

Also, what does

curl -XGET "http://localhost:9200/_nodes?filter_path=**.mlockall"

report?

I guess I misinterpreted this:

The first option is to use mlockall on Linux/Unix systems, or VirtualLock on Windows, to try to lock the process address space into RAM, preventing any Elasticsearch memory from being swapped out.
Disable swapping | Elasticsearch Guide [8.11] | Elastic

Anyway, on my clusters I chose to disable the paging file entirely, but ElasticHQ still reported a big number for swap.
Is bootstrap.mlockall: true still valid on 2.x?

Take a look at PR 18669 for the change. It was renamed to remove the association between the Elasticsearch setting and mlockall :smile:
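
In other words, it is the same behaviour under two names, roughly:

# Elasticsearch 2.x
bootstrap.mlockall: true

# Elasticsearch 5.x onwards (renamed in the PR above)
bootstrap.memory_lock: true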

What does

curl -XGET "http://localhost:9200/_nodes?filter_path=**.mlockall"

return?

I'm not sure where ElasticHQ is reporting swap from, but also take a look at that.

No, ElasticHQ does not misrepresent anything. It just reports the numbers it gets from GET _nodes/stats, stats.os.swap.used_in_bytes in this case. I've tested both settings before, on earlier versions and on 2.4.x, as mentioned by @forloop.

They never work. On my current ES 2.4.0 node, bootstrap.memory_lock: true is set and the paging file is disabled, yet the used swap is still a very large number.

mlockall is active:

"process" : {
        "refresh_interval_in_millis" : 1000,
        "id" : 7444,
        "mlockall" : true
      }

Still high used swap in _nodes/stats:

"swap": {
          "total_in_bytes": 207222677504,
          "free_in_bytes": 17371148288,
          "used_in_bytes": 189851529216
        }

Physical memory is 192 GB. Similarly high swap is reported on all the other Windows ES nodes.

What does Resource Monitor report for the Java process, specifically, commit and working set memory? mlockall: true is reported only if the call to set the process working set size via VirtualLock succeeds.

What do you see when monitoring the page-related counters under Performance Monitor (a typeperf one-liner is sketched after the list):

  • Page Faults/sec
  • Pages Input/sec
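
For example, something along these lines should sample both counters from the command line (counter names assumed to be the English ones; adjust the interval and sample count to taste):

typeperf "\Memory\Page Faults/sec" "\Memory\Pages Input/sec" -si 5 -sc 12

Here -si 5 -sc 12 takes a sample every 5 seconds for one minute.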

These are the stats for a dedicated master node with a 4 GB heap:

  • Commit: 4GB
  • Working set: 4GB
  • Page faults/sec: 0

Average Pages Input/sec: 250 (total, not per process)

This looks like the memory lock is set and working; commit and working set are at 4 GB, with no page faults.

What about on a data node?

Data node with a 30 GB heap:

  • Commit: 30GB
  • Working set: 40GB
  • Page faults/sec: 0

Does that mean ES is reporting the wrong swap usage number?

What does Resource Monitor show?

The numbers above were from Resource Monitor

Commit is at 30 GB, which matches the heap size specified, and page faults/sec is 0, which indicates to me that all of the committed virtual memory is backed by physical memory, i.e. RAM.

Working set can be larger than commit as it includes both private and shareable bytes. If you'd like to understand more, you can run VMMap on the process to get a more detailed breakdown.
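
If you do try VMMap, something along these lines should point it straight at the Elasticsearch JVM (the -p flag is from memory, so check vmmap /? if it doesn't take; the PID is whatever _nodes reports for that node, e.g. 7444 in the output you pasted earlier):

vmmap.exe -p 7444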

I hope that helps.

Thanks, I'll try VMMap.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.