The Docker WSL data under C:\Users\ethan\AppData\Local\Docker\wsl\data is now killing my computer's disk space. I'm in the midst of ingesting 537 CSV files (10.7 GB total) into Elasticsearch, and I'm not sure why it has become so big.
Here is what docker system df shows:

TYPE TOTAL ACTIVE SIZE RECLAIMABLE
Images 3 2 3.057GB 766.5MB (25%)
Containers 3 2 1.32GB 0B (0%)
Local Volumes 3 3 2.02GB 0B (0%)
Build Cache 0 0 0B 0B
My "C:\Users\ethan\AppData\Local\Docker\wsl\data\ext4.vhdx" being 25.6GB huge when I'm merely ingesting 492MB worth of csv files?
I checked my configs in logstash.yml and logstash.conf.
I bind-mounted these files into the Logstash container so the CSV data gets piped in.
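For reference, this is roughly how I start the Logstash container with those bind mounts (run from PowerShell; the host paths and the image tag here are just placeholders for my actual setup):

```
# host paths are placeholders; the container paths are the Logstash image defaults
docker run -d --name logstash `
  -v C:\path\to\logstash.yml:/usr/share/logstash/config/logstash.yml `
  -v C:\path\to\logstash.conf:/usr/share/logstash/pipeline/logstash.conf `
  -v C:\path\to\csv:/data/csv `
  docker.elastic.co/logstash/logstash:7.9.0
```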
I have tweaked things, and for now I am only ingesting 1 month's worth of CSV files to investigate.
I ran docker system df and I can see that my Local Volumes are growing in size. Nothing can be reclaimed.
Within the Logstash container, I can confirm the bind mounts to my CSV files are there.
However, I searched within the ES data directory and can't find anything big where my data lake resides.
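For reference, these are the checks I'm using to see where the space is actually going (run from PowerShell; this assumes Elasticsearch is exposed on the default localhost:9200, and curl.exe is used to bypass the PowerShell curl alias):

```
# show per-volume and per-image sizes instead of just the totals
docker system df -v

# ask Elasticsearch how big each index actually is on disk
curl.exe "localhost:9200/_cat/indices?v&h=index,docs.count,store.size"
```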
So the file has gone from 8.5GB -> 14.6GB and counting.
========================================
I have tried shutting down Docker and killing off vmmem (the WSL VM process) so that I can use diskpart to compact vdisk. That cut the file size down to 8.5GB.
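For anyone wanting to reproduce this, here is roughly the sequence I use to compact it (the vdisk path is mine; adjust as needed):

```
# quit Docker Desktop first, then stop the WSL VM so ext4.vhdx is no longer in use
wsl --shutdown

# then, from an elevated prompt, run diskpart and enter these at the DISKPART> prompt:
diskpart
select vdisk file="C:\Users\ethan\AppData\Local\Docker\wsl\data\ext4.vhdx"
attach vdisk readonly
compact vdisk
detach vdisk
exit
```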
However, when Docker restarts, the file grows again.
This is not good. I'm not sure what else to troubleshoot. Please help me.
At 1 million rows ingested, the disk grew a little more slowly, from 10.5 GB -> 13.5 GB.
I made some improvements to streamline how the mapping is done on the columns.
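To give an idea of what I mean (index and field names below are just placeholders for my real ones), the change is roughly to create the index with an explicit mapping up front, instead of letting every CSV column be dynamically mapped as both text and keyword, which seems to inflate the index on disk:

```
# placeholder index/field names - create the index with explicit column types (PowerShell)
$body = @'
{
  "mappings": {
    "properties": {
      "timestamp": { "type": "date" },
      "reading":   { "type": "float" },
      "station":   { "type": "keyword" }
    }
  }
}
'@
Invoke-RestMethod -Method Put -Uri "http://localhost:9200/my-csv-index" -ContentType "application/json" -Body $body
```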
The disk is still growing, but it does feel like it's growing a little more slowly.
I'm not sure how much more I can improve things to keep the disk growth minimal.
Now I am only ingesting 30 days of CSV files, with about 137,000 rows of data per day.
Eventually I want to ingest 18 months' worth of CSV files at about 137,000 rows per day.
I only have about 200GB of free space to work with, and I don't want to blow through my disk space budget.
I might even have more CSV files to ingest in the future, from other data sets.
PS: My file ingestion stopped midway, but the disk is still growing and is now at 25GB! This is astonishing!