Is there any recommendation for the amount of training data to have available for ML?
We currently store one week's worth of data in Elasticsearch, about 100GB in total. We have a process that cleans up any indexes older than a week, but we want to begin taking advantage of the ML capabilities and know we'll need to increase our disk space to retain more data.
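For context, our cleanup is roughly equivalent to an ILM policy with a delete phase like the one below (a sketch only; the policy name `delete-after-one-week` is illustrative, and our actual cleanup runs as a separate process):

```
PUT _ilm/policy/delete-after-one-week
{
  "policy": {
    "phases": {
      "delete": {
        "min_age": "7d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
```

So the question is essentially how far past `7d` that retention window should be extended.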
What we're unsure of is exactly how much data we need to retain. We currently store data associated with retail transactions and web traffic, and the data shows day-of-the-week, day-of-the-month, and monthly (seasonal) trends. What would be the recommended retention period for data of this nature to take advantage of ML?
Thanks in advance!