We are planning to deploy FSCrawler on an Azure VM, and I am trying to provision the VM. Here is the information relevant to sizing it:
- OS: Ubuntu 20.04
- PDF files total size: 1.5 TB
- FSCrawler schedule: a daily OS cron job that runs FSCrawler once per day
As per the FSCrawler documentation (Tips and tricks — FSCrawler 2.10-SNAPSHOT documentation), it will generate huge temporary files, and we can set up a cron job to clean them up periodically. This raises a concern for me when provisioning the VM in the cloud. Can anyone recommend what configuration I need for the VM, such as the number of CPU cores and the amount of storage?
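For reference, the periodic cleanup I have in mind is roughly the Python sketch below. The function name, the `*.tmp` pattern, and the 24-hour threshold are my own assumptions, not values taken from the FSCrawler documentation, and the actual temp directory and file names would need to be checked on the VM first.

```python
import time
from pathlib import Path


def remove_stale_temp_files(temp_dir, pattern="*.tmp", max_age_hours=24.0, now=None):
    """Delete files directly inside temp_dir that match pattern and are older
    than max_age_hours. Returns the list of paths that were removed.

    The default pattern and age are placeholders (assumptions), not values
    documented by FSCrawler.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_hours * 3600
    removed = []
    for path in Path(temp_dir).glob(pattern):
        try:
            # Only remove regular files whose last modification is before the cutoff.
            if path.is_file() and path.stat().st_mtime < cutoff:
                path.unlink()
                removed.append(path)
        except OSError:
            # The file may be in use or already gone; skip it and continue.
            continue
    return removed
```

A second cron entry could call this script some time after the daily crawl finishes, so files still in use by a running crawl are less likely to be touched.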
Any help will be greatly appreciated.