You'll need something that can actually load and execute the webpage and keep track of everything it downloads. Headless Chrome, PhantomJS, or possibly even a real browser driven by Selenium are all options (not sure about that last one). That's the hardest part; once you have that information, you can index it into Elasticsearch and visualize it however you like. You could even index each file separately and break the sizes down by type (HTML, CSS, JS, etc.).
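As a rough sketch of the aggregation step: if you capture Chrome DevTools Protocol network events (headless Chrome and Selenium's performance log both emit these), you can pair each `Network.responseReceived` event with its `Network.loadingFinished` event to total the bytes per MIME type. The event collection itself is assumed to have happened already; the function below only does the bookkeeping, and the field names follow the DevTools `Network` domain.

```python
import json
from collections import defaultdict

def size_by_type(perf_entries):
    """Aggregate downloaded bytes by MIME type from Chrome DevTools
    Network events (each entry is a performance-log record whose
    "message" field is a JSON-encoded DevTools event)."""
    mime_by_request = {}            # requestId -> mimeType
    bytes_by_type = defaultdict(int)
    for entry in perf_entries:
        msg = json.loads(entry["message"])["message"]
        method = msg.get("method")
        params = msg.get("params", {})
        if method == "Network.responseReceived":
            # Remember the MIME type announced for this request.
            mime_by_request[params["requestId"]] = params["response"]["mimeType"]
        elif method == "Network.loadingFinished":
            # encodedDataLength is the on-the-wire size of the response.
            mime = mime_by_request.get(params["requestId"], "unknown")
            bytes_by_type[mime] += int(params.get("encodedDataLength", 0))
    return dict(bytes_by_type)

# Hypothetical log entries, shaped like Selenium's get_log("performance") output:
events = [
    {"message": json.dumps({"message": {
        "method": "Network.responseReceived",
        "params": {"requestId": "1", "response": {"mimeType": "text/html"}}}})},
    {"message": json.dumps({"message": {
        "method": "Network.loadingFinished",
        "params": {"requestId": "1", "encodedDataLength": 5120}}})},
]
print(size_by_type(events))
```

The resulting dict (one document per page load, or one per file if you want the finer breakdown) is exactly the kind of thing you'd bulk-index into Elasticsearch and chart by type.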