Having read through https://www.elastic.co/guide/en/elasticsearch/reference/5.x/mapping-source-field.html a few times for different versions, I'm wondering what the less obvious impacts of storing _source are, besides increased disk utilization.
_source is only pulled back on a get, correct?
What's the memory impact of using _source? If my ES node has a shard loaded in RAM, does that shard pull every _source field it holds into RAM as well? Or is the _source field kept on disk until the get occurs?
How does the inverted index relate to the _source field? Non-issue?
Since _source appears to be stored as JSON, how would binary document formats be handled when storing _source? Converted to JSON? Stored as some sort of blob in the JSON body (encoded or some other way)?
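To make the "blob in the JSON body" option concrete, here is roughly what I picture: a minimal Python sketch, assuming the binary file is base64-encoded into a string field before indexing. The helper `build_source` and the field names `filename` and `data` are made up for illustration, and I don't know whether this matches what ES actually does with _source.

```python
import base64
import json


def build_source(filename: str, payload: bytes) -> str:
    """Return a JSON document body with the binary payload base64-encoded.

    Both field names below are hypothetical, chosen only for this sketch.
    """
    doc = {
        "filename": filename,
        # JSON has no binary type, so the bytes are carried as base64 text.
        "data": base64.b64encode(payload).decode("ascii"),
    }
    return json.dumps(doc)


if __name__ == "__main__":
    raw = b"%PDF-1.4 example bytes"
    body = build_source("report.pdf", raw)
    encoded = json.loads(body)["data"]
    # The round trip should give back the original bytes.
    print(base64.b64decode(encoded) == raw)
    # Compare the raw size with the encoded size.
    print(len(raw), len(encoded))
```

If it works this way, base64 makes the stored text roughly a third larger than the original bytes, which is part of why I'm asking about disk and memory impact for document-heavy use cases.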
Just curious here, as I'm considering the impact of making this change across all of the customer use cases I have, from document-heavy (PDF, Word, Excel, etc.) to memory-constrained.