Here is some more information on what I have tried in the meantime (apologies that my examples use the Python API rather than plain curl). The simplest case is excluding a single field from _source:
from elasticsearch import Elasticsearch
es = Elasticsearch([{'host': '127.0.0.1', 'port': 9200}])
body = {"settings":
            {"index.mapping.ignore_malformed": "true"},
        "mappings":
            {"reports": {"_all": {"enabled": "false"},
                         "_source": {"excludes": ["data"]}, ... (other mappings)
es.indices.create(index='test', body=body)
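For anyone who wants to reproduce this, a stripped-down body containing only the parts shown above might look roughly like the sketch below. The other mappings are deliberately left out, so this is an assumption about the minimal case rather than my exact configuration. The name body_json is only used for illustration.

```python
import json

# Minimal index body: only the settings and mapping parts shown above.
# The other mappings from the original body are intentionally left out.
body = {
    "settings": {"index.mapping.ignore_malformed": "true"},
    "mappings": {
        "reports": {
            "_all": {"enabled": "false"},
            "_source": {"excludes": ["data"]},
        }
    },
}

# Serialising the body is a quick way to confirm it is well-formed JSON
# before passing it to es.indices.create(index='test', body=body).
body_json = json.dumps(body, indent=4, sort_keys=True)
print(body_json)
```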
Then I index a document of type "reports":
report = open("somepath/report.json",'rb').read()
print es.index(index='test', doc_type="reports", body=report)
This returns an ID, which I use to fetch the document and its source:
import json
print json.dumps(es.get('test',id="ID"),indent=4, sort_keys=True)
print json.dumps(es.get_source('test',doc_type="reports",id="ID"),indent=4, sort_keys=True)
Now, when I look at the output of get_source, the "data" field is still there, even though it is listed under excludes in the mapping.
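To narrow down where this goes wrong, one thing that could be checked is whether the exclude actually ended up in the stored mapping, and which excluded fields still show up in the returned source. The sketch below is only that, a sketch: the helper names source_excludes and leaked_fields are hypothetical, the sample dictionaries are made up, and the response layout {index: {"mappings": {doc_type: ...}}} is an assumption that may differ between Elasticsearch versions. It also handles only literal top-level field names, not wildcard patterns.

```python
def source_excludes(mapping_response, index, doc_type):
    """Return the _source excludes list from a get_mapping-style response.

    Assumes the layout {index: {"mappings": {doc_type: {...}}}}; other
    Elasticsearch versions may nest this differently.
    """
    type_mapping = (
        mapping_response.get(index, {}).get("mappings", {}).get(doc_type, {})
    )
    return type_mapping.get("_source", {}).get("excludes", [])


def leaked_fields(source, excludes):
    """Return the excluded top-level field names still present in a source dict."""
    return sorted(name for name in excludes if name in source)


# Hypothetical sample data standing in for real responses from the cluster
sample_mapping = {
    "test": {"mappings": {"reports": {"_source": {"excludes": ["data"]}}}}
}
sample_source = {"title": "report", "data": "..."}

excludes = source_excludes(sample_mapping, "test", "reports")
print(leaked_fields(sample_source, excludes))
```

Against a live cluster, the sample dictionaries would be replaced by the results of es.indices.get_mapping(index='test') and the es.get_source call shown above. An empty excludes list would suggest the mapping was never applied as intended, while a non-empty result from leaked_fields would confirm that the field is being returned despite the exclude.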