Performance problems: memory swapping on Windows?

Setup: ES 1.6.0 on Windows 2008 R2, Java 1.8.0_45, G1GC

I have 4 nodes, each with 32 cores, 128GB RAM and 5TB Fusion IO
ES_HEAP_SIZE=30GB per node

6 "monthly" indexes of 8 shards + 1 replica
1.3 billion docs, 13TB (includes backup)
4k-12k doc size avg
Shards are about 150GB each
All doc values

I have 1 fairly big aggregation.

  • Pre filtered by 1 single "user id"
  • Over 6 month period (all indexes)
  • 29 aggregations, none nested
  • All aggs are either sum or avg
  • It's all doc values, no field data cache.
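
For readers who want to picture the request, here is a rough sketch of what a search body of this shape might look like, built in Python. The field names (`user_id`, `metric_0` ... `metric_28`) and the 15/14 split between sum and avg are made up for illustration; the real mapping isn't shown in this thread. It assumes the ES 1.x `filtered` query syntax.

```python
def build_big_aggregation(user_id, sum_fields, avg_fields):
    """Build a search body with one flat sum/avg aggregation per field.

    All field names passed in are hypothetical placeholders.
    """
    aggs = {}
    for field in sum_fields:
        aggs["sum_" + field] = {"sum": {"field": field}}
    for field in avg_fields:
        aggs["avg_" + field] = {"avg": {"field": field}}
    return {
        "size": 0,  # only the aggregation results are wanted, no hits
        "query": {
            "filtered": {
                # pre-filter on a single user id (hypothetical field name)
                "filter": {"term": {"user_id": user_id}}
            }
        },
        "aggs": aggs,
    }


# 29 flat aggregations in total: 15 sums and 14 averages (split is assumed)
sum_fields = ["metric_%d" % i for i in range(15)]
avg_fields = ["metric_%d" % i for i in range(15, 29)]
body = build_big_aggregation(42, sum_fields, avg_fields)
```

Serialising `body` with `json.dumps` and sending it to a search over all six monthly indexes would roughly correspond to the "big" query described above.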

When I run this query in Sense it takes about 10 seconds to complete. I guess that's OK.

Then I run a load-balanced "stress" test against all 4 nodes using JMeter, running a single sum agg filtered by user and a single random date. That works fine... but then I go back and run the big aggregation above and it takes 600 seconds to complete. I see 1 node's IOPS spiking to 10K while 2 others are idle and the remaining one is running at 2K.

I'm not running any other operations like bulk indexing or anything like that. I wait for one test to finish before running the other. Apart from the standard recovery cluster settings I haven't configured anything else; it's all default for searches.

I also see that all nodes have almost 100% memory usage: 30GB for the ES process and the rest is mapped files.

I have noticed quite erratic query performance lately. Queries that used to take 2-3 seconds are now taking 600+ seconds.

I looked at 1 node.
Working Set: 127GB
Memory Primary: 22GB
Commit Size: 41GB
Paged Pool: 7GB
Non Paged: 1GB
Page Faults: 450,000,000
PF Deltas: 15K

Any thoughts on this?

With that amount of data, you probably want more nodes!

That's what I have now; my plan is to recommend starting with 8. I'm also testing at the capacity we expect to reach in the next year to year and a half.

But is there a technical explanation of what the issue might be with the current setup, so I have something to go on and can explain to my team why I need more nodes? I've never had to deal this much with memory, memory-mapped files, pages, page faults, etc. :smile:

Or simply put... if I have a 1TB index (including replicas), I suppose I need at least 1TB of RAM if I expect it to fit entirely into memory-mapped files, right?

Otherwise the OS will keep paging/swapping unused files out of the cache to make room for new ones?
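
To put rough numbers on this for the cluster described above, here is a back-of-the-envelope sketch. It assumes that everything outside the 30GB heap is available to the OS file cache (in practice the OS and other processes take some of it) and that the full 13TB is index data on disk; neither assumption is confirmed in this thread.

```python
def page_cache_coverage(nodes, ram_gb, heap_gb, data_tb):
    """Return (total file-cache GB, fraction of on-disk data it could hold).

    Assumes all RAM outside the JVM heap is usable as OS file cache,
    which is an optimistic upper bound.
    """
    cache_gb = nodes * (ram_gb - heap_gb)
    data_gb = data_tb * 1024
    return cache_gb, cache_gb / data_gb


# Current setup: 4 nodes, 128GB RAM, 30GB heap, 13TB on disk
cache_gb, fraction = page_cache_coverage(4, 128, 30, 13)
print("file cache: %d GB, covers %.1f%% of the data" % (cache_gb, fraction * 100))

# Proposed setup with 8 nodes of the same size
cache_gb_8, fraction_8 = page_cache_coverage(8, 128, 30, 13)
print("file cache: %d GB, covers %.1f%% of the data" % (cache_gb_8, fraction_8 * 100))
```

Under these assumptions only a few percent of the data can be cached at any one time, so a query that touches 6 months of shards right after a different workload has filled the cache would have to read most of what it needs from disk.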

Why would you want it to exist entirely in RAM though? That is possible, but not really what ES was designed for.