Aggregations in 2.1.0 much slower than 1.6.0

A little. We saw that aggregations like the one below (building global ordinals) were hot, and the vast majority of the added time in 2.1.1 vs. 1.5.2 was in our aggregations. Queries alone were somewhat slower too (~10%), possibly due to something like the problem reported in "Elastic 2.0 slower query execution speed as 1.3". The fix for that appears to be a long way out (unreleased Lucene 5.5 vs. the current 5.3.1), and the alternative is to modify all of our queries manually. We're seeing these performance regressions even though our new cluster uses much faster local SSDs and has no ambient load, unlike our production reference instance. It seems that waiting for the Elasticsearch team to restore performance parity in a future version is the most prudent choice.

Our aggs consist of lots of terms aggregations, some with long exclude arrays, plus a smattering of filter, match_all, nested, reverse_nested, date_histogram, and stats.
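For anyone trying to reproduce this, here is a rough sketch of the general shape of such a request, written as a Python dict in the Elasticsearch 2.x query DSL. It is not our actual query: every field name (`account_id`, `category`, `created_at`, `price`, `items`, `items.tag`) and the helper `build_example_request` are made up for illustration. The string terms agg nested under a numeric terms agg mirrors the nesting visible in the trace below.

```python
import json


def build_example_request(excluded_terms):
    """Build a search body mixing the aggregation types listed above.

    All field names are hypothetical placeholders, not our real mapping.
    """
    return {
        "size": 0,  # aggregations only, no hits
        "query": {"match_all": {}},
        "aggs": {
            # Numeric terms agg with a string terms sub-agg; the sub-agg
            # carries a long array of exact values to exclude.
            "by_account": {
                "terms": {"field": "account_id"},
                "aggs": {
                    "by_category": {
                        "terms": {
                            "field": "category",
                            "exclude": list(excluded_terms),
                        }
                    }
                },
            },
            # Filter agg (here just match_all) wrapping a histogram and stats.
            "everything": {
                "filter": {"match_all": {}},
                "aggs": {
                    "per_day": {
                        "date_histogram": {"field": "created_at", "interval": "day"}
                    },
                    "price_stats": {"stats": {"field": "price"}},
                },
            },
            # Nested agg that steps back out to the parent documents.
            "in_items": {
                "nested": {"path": "items"},
                "aggs": {
                    "by_tag": {
                        "terms": {"field": "items.tag"},
                        "aggs": {
                            "back_to_parent": {
                                "reverse_nested": {},
                                "aggs": {
                                    "parent_category": {
                                        "terms": {"field": "category"}
                                    }
                                },
                            }
                        },
                    }
                },
            },
        },
    }


if __name__ == "__main__":
    body = build_example_request(["spam", "test", "internal"])
    print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a `_search` request against both clusters to compare timings, assuming the placeholder fields are swapped for ones that exist in your mapping.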

Example hot agg trace:

org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator.getLeafCollector(GlobalOrdinalsStringTermsAggregator.java:94)
       org.elasticsearch.search.aggregations.AggregatorBase.getLeafCollector(AggregatorBase.java:132)
       org.elasticsearch.search.aggregations.AggregatorFactory$1$1.collect(AggregatorFactory.java:204)
       org.elasticsearch.search.aggregations.LeafBucketCollector$3.collect(LeafBucketCollector.java:73)
       org.elasticsearch.search.aggregations.bucket.BucketsAggregator.collectExistingBucket(BucketsAggregator.java:80)
       org.elasticsearch.search.aggregations.bucket.BucketsAggregator.collectBucket(BucketsAggregator.java:72)
       org.elasticsearch.search.aggregations.bucket.terms.LongTermsAggregator$1.collect(LongTermsAggregator.java:98)
       org.elasticsearch.search.aggregations.AggregatorFactory$1$1.collect(AggregatorFactory.java:208)
       org.elasticsearch.search.aggregations.LeafBucketCollector$3.collect(LeafBucketCollector.java:73)