High-cardinality fields with doc values can be one reason, as Mark said. For instance, if all your values are unique and you have two segments with 1M unique values each, then the merged segment will have 2M unique values, which requires one more bit per document for addressing. There isn't really anything that can be done about it; this is just the way things are designed.
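To make the "one more bit per document" point concrete, here is a minimal sketch of the addressing cost. It assumes each document stores an ordinal into the segment's sorted set of unique values, so the per-document cost is the number of bits needed to represent the largest ordinal (`ordinal_bits` is a hypothetical helper, not an Elasticsearch or Lucene API):

```python
import math

def ordinal_bits(unique_values: int) -> int:
    """Minimum bits per document needed to address one of
    `unique_values` distinct ordinals."""
    return max(1, math.ceil(math.log2(unique_values)))

# Two segments with 1M unique values each, merged into one with 2M:
print(ordinal_bits(1_000_000))  # 20 bits per doc before the merge
print(ordinal_bits(2_000_000))  # 21 bits per doc after the merge
```

Crossing a power of two in the unique-value count is what adds the extra bit, so merging two fully unique segments of equal size always pays this cost.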
Another potential reason is sparse fields with doc values. For efficiency reasons, Elasticsearch needs to reserve space for documents that don't have a value. Imagine you have two segments: segment 1 has values for field `foo` but segment 2 does not. Field `foo` does not require any disk space in segment 2, but as soon as you merge those segments, Elasticsearch suddenly needs to reserve space for all documents of segment 2 even though they don't have a value for `foo`. This is something that we hope to improve soon in the extreme cases (when less than 1% of documents have a value for a given field). You can see https://issues.apache.org/jira/browse/LUCENE-6863 for more information.
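A rough sketch of why the merge costs space, assuming a dense encoding that reserves a fixed number of bytes per document regardless of whether the document has a value (the 8 bytes per value and the helper name are illustrative assumptions, not actual Lucene sizes):

```python
def dense_doc_values_bytes(doc_count: int, bytes_per_value: int = 8) -> int:
    """With a dense doc-values encoding, space is reserved for every
    document in the segment, with or without a value for the field."""
    return doc_count * bytes_per_value

# Segment 1: 1M docs, all with a value for `foo`.
seg1 = dense_doc_values_bytes(1_000_000)
# Segment 2: 1M docs, none with a value, so no `foo` data is written at all.
seg2 = 0
# Merged segment: space reserved for all 2M docs, values or not.
merged = dense_doc_values_bytes(2_000_000)
print(seg1 + seg2, merged)  # 8000000 16000000
```

The merged segment needs twice the space even though the number of actual values did not change, which is the behavior the linked Lucene issue aims to improve for very sparse fields.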