Agreeing with the points @rusty raised: doc values being on by default adds some CPU/IO overhead and extra disk usage, the translog now fsyncs on every operation (instead of every 5s), and there's the replica issue.
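If the per-operation translog fsync is hurting you, ES 2.0 lets you switch back to interval-based syncing. A minimal sketch (the index name `my_index` is a placeholder; check the translog docs for your exact version before relying on this):

```json
PUT /my_index/_settings
{
  "index.translog.durability": "async",
  "index.translog.sync_interval": "5s"
}
```

Note the trade-off: with `async` durability you can lose up to `sync_interval` worth of operations on a crash, which is why `request` is the default.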
In addition to that, there was a change at the Lucene layer. Incoming blob of text, but the tl;dr is that Lucene identifies idle resources and utilizes them, making the resource usage look higher when it's really just getting work done faster.
So, in Elasticsearch 1.x, we forcefully throttled Lucene's segment merging process to prevent it from over-saturating your nodes/cluster.
The problem is that a strict threshold is almost never the right answer. If you are indexing heavily, you often want to increase the threshold to let Lucene use all your CPU and Disk IO. If you aren't indexing much, you likely want the threshold lower. But you also want it to be able to "burst" the limit for one-off merges when your cluster is relatively idle.
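For reference, this is roughly what the fixed 1.x throttle looked like, as static node settings in `elasticsearch.yml` (values shown are the 1.x defaults; verify against your version's docs):

```yaml
# ES 1.x: static, one-size-fits-all merge throttle
indices.store.throttle.type: merge
indices.store.throttle.max_bytes_per_sec: 20mb
```

Whatever number you picked here was wrong some of the time: too low during heavy indexing, needlessly high when idle, and never able to burst.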
In Lucene 5.x (used in ES 2.0+), they added a new style of merge throttling that monitors how active the index is, and automatically adjusts the throttle threshold (see https://issues.apache.org/jira/browse/LUCENE-6119, https://github.com/elastic/elasticsearch/pull/9243 and https://github.com/elastic/elasticsearch/pull/9145).
In practice, this means indexing tends to be faster in ES 2.0+ because segments are allowed to merge as fast as your cluster can handle without over-saturating it. But it also means your cluster will happily use any idle resources, which is why you see higher resource utilization.
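If you really want the old, quieter behavior, ES 2.0 exposes a switch for the adaptive throttle. A sketch, assuming the 2.x setting name (`my_index` is a placeholder; generally you should leave this enabled):

```json
PUT /my_index/_settings
{
  "index.merge.scheduler.auto_throttle": false
}
```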
Basically, Lucene identified that those resources weren't being used...so it put them to work to finish the task faster.