Performance memory swapping Windows?

(None) #1

Running ES 1.6.0 on Windows 2008 R2 Java 1.8_45 G1GC Setup

I have 4 nodes, each with 32 cores, 128GB RAM and 5TB Fusion IO
ES_HEAP_SIZE=30GB per node

6 "monthly" indexes of 8 shards + 1 replica
1.3 Billion Docs 13TB (includes backup)
4k-12k doc size avg
Shards are about 150GB each
All doc values

I have 1 fairly big aggregation.

  • Pre filtered by 1 single "user id"
  • Over 6 month period (all indexes)
  • 29 aggregations, none nested
  • All aggs are either sum or avg
  • It's all doc values, no field data cache.
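
For readers trying to picture the request, here is a minimal sketch of what such a search body might look like in ES 1.x syntax (a `filtered` query wrapping a `term` filter, with flat metric aggregations). The field names `user_id` and `metric_0` … `metric_28`, and the alternating sum/avg split, are placeholders and not taken from the actual mapping. The 6 month period is assumed to be covered by sending the request to all six monthly indexes rather than by a date filter.

```python
import json


def build_big_agg(user_id, agg_kinds):
    """Build a search body: one user filter plus flat sum/avg aggregations.

    agg_kinds maps a (hypothetical) numeric field name to "sum" or "avg".
    """
    aggs = {}
    for field, kind in agg_kinds.items():
        if kind not in ("sum", "avg"):
            raise ValueError(f"unsupported aggregation: {kind}")
        # One top-level metric aggregation per field, nothing nested.
        aggs[f"{kind}_{field}"] = {kind: {"field": field}}
    return {
        "size": 0,  # only the aggregation results are wanted, no hits
        "query": {"filtered": {"filter": {"term": {"user_id": user_id}}}},
        "aggs": aggs,
    }


# 29 placeholder fields, alternating sum and avg (an assumption for illustration).
agg_kinds = {f"metric_{i}": ("sum" if i % 2 == 0 else "avg") for i in range(29)}
body = build_big_agg("some-user", agg_kinds)
print(json.dumps(body, indent=2)[:300])
```

Each of the 29 aggregations reads a different doc values column for every matching document, so the request touches many separate regions of the on-disk index files.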

When I run this query in Sense it takes about 10 seconds to return. I guess that's OK.

Then I run a load-balanced "stress" test against all 4 nodes using JMeter, running a single sum agg filtered by user and a random single date. That works fine. But when I go back and run the big aggregation above, it takes 600 seconds to complete. I see 1 node's IOPS spiking to 10K while 2 others are idle and the other is running at 2K.

I'm not running any other operations like bulk indexing. I wait for one test to finish before running the other. Apart from the standard recovery cluster settings I haven't configured anything else; it's all default for searches.

I also see that all nodes have almost 100% memory usage: 30GB for the ES process, and the rest is mapped files.

I have noticed quite erratic performance with queries lately. Queries that used to take 2-3 seconds are now taking 600+ seconds.

I looked at 1 node.
Working Set: 127GB
Memory Primary: 22GB
Commit Size: 41GB
Paged Pool: 7GB
Non Paged: 1GB
Page Faults: 450,000,000
PF Deltas: 15K
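
As a rough back-of-envelope check on the numbers above, the sketch below estimates how much of each node's on-disk data could sit in the OS file cache at once. It assumes the 13TB is spread evenly over the 4 nodes, that 1TB = 1000GB, and that all RAM outside the 30GB heap is available for memory-mapped files, which is an upper bound since the OS and other processes need some of it.

```python
def page_cache_coverage(total_data_gb, nodes, ram_gb, heap_gb):
    """Return (data per node, max cache per node, fraction of data cacheable)."""
    data_per_node_gb = total_data_gb / nodes
    # Upper bound: everything that is not JVM heap is treated as file cache.
    cache_gb = ram_gb - heap_gb
    return data_per_node_gb, cache_gb, cache_gb / data_per_node_gb


for nodes in (4, 8):
    data_gb, cache_gb, coverage = page_cache_coverage(13_000, nodes, 128, 30)
    print(f"{nodes} nodes: {data_gb:.0f}GB data/node, "
          f"{cache_gb}GB cache/node, {coverage:.1%} cacheable")
```

Under these assumptions only a few percent of a node's index data fits in the file cache at any time, so a query that reads segments evicted by an earlier workload has to go back to disk.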

(None) #2

Any thoughts on this?

(Mark Walkom) #3

With that amount of data, you probably want more nodes!

(None) #4

That's what I have now; my plan is to recommend starting with 8. I'm also testing against what the capacity is expected to be in the next year to year and a half.

But is there a technical explanation of what the issue might be on the current setup, so I have something to go on and can explain to my team why I need more nodes? I've never had to deal so much with memory, memory-mapped files, pages, page faults etc... :smile:

(None) #5

Or simply put... If I have a 1TB index (including replicas), I suppose I need at least 1TB of RAM if I expect it to fit entirely into mmapped files, right?

Otherwise the OS will keep paging/swapping out unused files from the cache to make room for new ones?

(Mark Walkom) #6

Why would you want it to exist entirely in RAM though? That is possible, but not really what ES was designed for.
