I have a single-node Elasticsearch 5.1.2 cluster with a filebeat index on it. I'm seeing a pretty big difference between the size reported by Elasticsearch and the size on disk. E.g.
curl localhost:9200/_cat/indices/filebeat-2017.01.31?v
health status index               uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   filebeat-2017.01.31 KQSgom68Qqio4-KOAwW-UQ   5   0    3082004            0    890.3mb        890.3mb
But on the OS:
du -hs nodes/0/indices/KQSgom68Qqio4-KOAwW-UQ
3.3G    nodes/0/indices/KQSgom68Qqio4-KOAwW-UQ
Am I missing something here?
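EDIT: Wait... I think I just figured it out: it's the translog. du counts the translog files sitting under each shard directory, and store.size doesn't seem to include them.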
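For anyone who wants to check the same thing, here is a rough sketch of how the translog's share could be measured. It assumes the 5.x on-disk layout where each shard directory under the index UUID has its own translog subdirectory, and that the node answers on localhost:9200 as above. The function names translog_kb and translog_stats are made up for this example.

```shell
# Sum the on-disk size (in KB) of every shard's translog directory
# under one index directory.
# Assumes the layout <index-dir>/<shard>/translog.
translog_kb() {
    # du -sk prints one "<kilobytes> <path>" line per translog directory;
    # awk adds up the first column and prints 0 if nothing matched.
    du -sk "$1"/*/translog 2>/dev/null | awk '{ total += $1 } END { print total + 0 }'
}

# Ask Elasticsearch what it thinks the translog size is.
# Uses the translog metric of the indices stats API.
translog_stats() {
    curl -s "localhost:9200/$1/_stats/translog?pretty"
}
```

Running translog_kb nodes/0/indices/KQSgom68Qqio4-KOAwW-UQ and comparing the result with the du total should show whether the translog accounts for the gap, and translog_stats filebeat-2017.01.31 gives Elasticsearch's own figure to compare against.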