How could an index grow to nearly 3 times the size of the raw data?

I have an elasticsearch-1.7 cluster and ingested about 800GB of log data into one index. Afterwards, the /_cat/indices API showed the index size as 2.2TB.

The number_of_replicas of this index is already set to 0, and the _all field is disabled too. So how could the index inflate so much? It is nearly 3 times the raw size.
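To make the "nearly 3 times" figure concrete, here is a minimal Python sketch of the arithmetic. It assumes the two sizes are the rounded figures quoted above and that the units are binary (1 tb = 1024 gb). The helper names `to_bytes` and `inflation_ratio` are made up for this illustration and are not part of any Elasticsearch client.

```python
# Hypothetical helpers for illustration only; not an Elasticsearch API.
# Assumption: sizes use binary units (1 tb = 1024 gb).
UNITS = {"b": 1, "kb": 1024, "mb": 1024 ** 2, "gb": 1024 ** 3, "tb": 1024 ** 4}


def to_bytes(size):
    """Convert a size string such as '800gb' or '2.2tb' to a byte count."""
    text = size.strip().lower()
    # Try longer suffixes first so 'gb' is not mistaken for plain 'b'.
    for suffix in sorted(UNITS, key=len, reverse=True):
        if text.endswith(suffix):
            return float(text[: -len(suffix)]) * UNITS[suffix]
    raise ValueError("unrecognized size: %r" % size)


def inflation_ratio(raw, indexed):
    """Return how many times larger the index is than the raw data."""
    return to_bytes(indexed) / to_bytes(raw)


# Figures taken from the question: 800GB raw, 2.2TB on disk.
ratio = inflation_ratio("800gb", "2.2tb")
print("index is %.2fx the raw size" % ratio)
```

With binary units this comes to roughly 2.8x. With decimal units it would be 2.75x. Either way it is close to 3 times the raw size.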

Most of the log data has no extracted fields: those documents only have a few metadata fields and one raw message. Only 10% of the data is in JSON format and may have some additional fields.


I have heard that elasticsearch-1.x may use more space after segment merges if multiple _types share a field, where typeA has a few long documents and typeB has many short documents.

Could the same problem also occur when an index contains only one _type? And is it resolved in elasticsearch-2.3?

What does the mapping look like?