Docker Hard Disk Image File Is Too Large

It's now eating up my disk space. I'm in the midst of ingesting 537 CSV files (10.7GB in total) into Elasticsearch.


I'm not sure why it became so big.


TYPE            TOTAL     ACTIVE    SIZE      RECLAIMABLE
Images          3         2         3.057GB   766.5MB (25%)
Containers      3         2         1.32GB    0B (0%)
Local Volumes   3         3         2.02GB    0B (0%)
Build Cache     0         0         0B        0B

Why is my "C:\Users\ethan\AppData\Local\Docker\wsl\data\ext4.vhdx" a huge 25.6GB when I'm merely ingesting 492MB worth of CSV files?
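To put a number on the gap, here is a rough sketch that adds up the SIZE column from my docker system df output and compares it with the size of the vhdx file. The unit table and the to_bytes helper are my own and purely illustrative. I am assuming Docker reports sizes in powers of 1000 and I am ignoring the small difference between decimal and binary gigabytes on the Windows side.

```python
# Sizes copied from the SIZE column of the `docker system df` output above.
reported = {
    "Images": "3.057GB",
    "Containers": "1.32GB",
    "Local Volumes": "2.02GB",
    "Build Cache": "0B",
}

# Assumption: Docker prints sizes with decimal (power-of-1000) units.
UNITS = {"kB": 1e3, "MB": 1e6, "GB": 1e9, "TB": 1e12, "B": 1.0}


def to_bytes(size: str) -> float:
    """Convert a string such as '3.057GB' or '0B' to a number of bytes."""
    # Try the two-letter suffixes first so 'GB' is not mistaken for 'B'.
    for suffix in sorted(UNITS, key=len, reverse=True):
        if size.endswith(suffix):
            return float(size[: -len(suffix)]) * UNITS[suffix]
    raise ValueError(f"unrecognised size: {size!r}")


accounted_gb = sum(to_bytes(s) for s in reported.values()) / 1e9
vhdx_gb = 25.6  # size of ext4.vhdx as shown by Windows

print(f"Accounted for by Docker: {accounted_gb:.3f} GB")
print(f"Not accounted for:       {vhdx_gb - accounted_gb:.3f} GB")
```

By this count Docker itself only accounts for about 6.4GB, so roughly 19GB of the vhdx is not explained by images, containers, or volumes. As far as I understand, the vhdx file grows on demand and does not shrink by itself when data inside it is deleted, which may be part of the gap.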

I checked my configs in logstash.yml and logstash.conf.

I bind-mounted these files into Logstash to be piped in.

I have tweaked things. Now I am only ingesting 1 month's worth of CSV files to investigate.

I ran docker system df and I see that my Local Volumes are growing in size. No space can be reclaimed.

Within the Logstash container, I can confirm the bind mounts to my CSV files.

However, I searched within the Elasticsearch file directory and can't find anything big where my data lake resides.

So the file has gone from 8.5GB -> 14.6GB and counting.


I have tried shutting down Docker and killing off vmmem.wsl so that I can use diskpart to compact the vdisk. That cut the file size to 8.5GB.

However, when Docker restarts, the file grows again.

This is not good. I'm not sure what else to troubleshoot. Please help me.


At 1 million rows ingested, the disk grew a little more slowly, from 10.5 GB -> 13.5 GB.

I made some improvements to streamline how the columns are mapped.

The disk is still growing... but it does feel like it is growing a little more slowly.

I'm not sure how much more I can improve things to keep the disk growth to a minimum.

Now I am only ingesting 30 days of CSV files. Each day has about 137,000 rows of data.

Eventually I want to ingest 18 months' worth of CSV files, at about 137,000 rows of data per day.

I only have about 200GB of free space to work with. I don't want to blow my disk space budget.
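Here is a rough back-of-envelope projection of what the full ingestion might cost, using only the figures in this post. It rests on two assumptions of mine: that a month is 30 days, and that the 10.5 GB -> 13.5 GB growth I saw over about 1 million rows (roughly 3 GB per million rows) stays linear. The real rate could be quite different once the mapping improvements take effect.

```python
ROWS_PER_DAY = 137_000
DAYS = 18 * 30       # assumption: 18 months of 30 days each
BUDGET_GB = 200      # free space I have to work with

# Assumption: the observed 10.5 GB -> 13.5 GB growth corresponds to
# about 1 million ingested rows and scales linearly with row count.
gb_per_million_rows = (13.5 - 10.5) / 1.0

total_rows = ROWS_PER_DAY * DAYS
projected_gb = total_rows / 1_000_000 * gb_per_million_rows

print(f"Total rows:       {total_rows:,}")
print(f"Projected growth: {projected_gb:.1f} GB")
print(f"Within budget:    {projected_gb <= BUDGET_GB}")
```

At that rate, about 74 million rows would need roughly 222 GB, which is already over my 200GB budget before adding any other data sets.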

I might even have more CSV files to ingest in future, from other data sets.

PS: My file ingestion stopped midway, but the disk is still growing and is now at 25 GB! This is astonishing!

