It's odd that you only see the slowdown when indexing with nested documents.
Nested documents do require more work to index and merge, since under the hood each of your 1 to 3 nested docs (plus the parent doc) is indexed as a separate document in Lucene, but that work should not be increasing over time.
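To make the per-document cost concrete, here is a rough sketch of the counting involved. It assumes a single level of nesting and a hypothetical field name (`comments`) mapped as `nested`; the helper `lucene_doc_count` is illustrative only and is not part of any Elasticsearch client.

```python
def lucene_doc_count(parent: dict, nested_field: str) -> int:
    """Estimate Lucene documents written for one parent document.

    Assumption: `nested_field` is mapped as `nested` and holds a list of
    objects; each object becomes its own Lucene document, plus one for
    the parent itself.
    """
    nested = parent.get(nested_field, [])
    return len(nested) + 1


# Hypothetical parent with three nested objects.
doc = {"title": "example", "comments": [{"text": "a"}, {"text": "b"}, {"text": "c"}]}
print(lucene_doc_count(doc, "comments"))  # 3 nested + 1 parent
```

So with 1 to 3 nested docs per parent, each index request writes 2 to 4 Lucene documents. That is a constant factor per request, which is why it should not by itself explain a slowdown that grows over time.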
Can you pull a full hot threads dump from all nodes (pass e.g. threads=10000; see https://www.elastic.co/guide/en/elasticsearch/reference/2.1/cluster-nodes-hot-threads.html) once indexing with the nested documents has gotten very slow?
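If it helps, a minimal sketch of grabbing that output with only the Python standard library is below. It assumes a node is reachable at `http://localhost:9200` (adjust for your cluster) and uses the `/_nodes/hot_threads` endpoint with the `threads` parameter; the function names are made up for this example.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def hot_threads_url(base_url: str, threads: int = 10000) -> str:
    """Build the hot threads URL covering all nodes in the cluster."""
    query = urlencode({"threads": threads})
    return f"{base_url.rstrip('/')}/_nodes/hot_threads?{query}"


def fetch_hot_threads(base_url: str, threads: int = 10000) -> str:
    """Fetch the plain-text hot threads report (requires a reachable node)."""
    with urlopen(hot_threads_url(base_url, threads), timeout=30) as resp:
        return resp.read().decode("utf-8")


# Example, assuming a local node (not executed here):
#   print(fetch_hot_threads("http://localhost:9200"))
```

Running it a few times while indexing is slow, and saving each report, makes it easier to see which threads stay busy.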