Hello experts, after some searching online, the best option I found is this project: https://github.com/dadoonet/fscrawler. However, going by its own description, it only supports local file systems on Windows.
We would really appreciate it if you could shed some light on this, since we are new to this area.
I see, thanks @dadoonet, great to see your reply. I just wonder if you have any performance metrics for fscrawler. We are thinking of using it on terabytes of data on a Windows file share. Is there anything you think we need to pay attention to?
I'd just say that it has not been optimized at all. For example, it is single-threaded, so you'd be better off launching multiple instances in parallel, for instance one per first-level subdirectory.
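To make the "one instance per first-level subdirectory" idea concrete, here is a rough Python sketch. It is only an illustration under assumptions: `plan_jobs`, `run_all`, and the `build_command` callback are hypothetical helper names, not part of fscrawler, and the actual command line for each instance is deliberately left to the caller.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def plan_jobs(root, prefix="share"):
    """Return one (job_name, directory) pair per first-level subdirectory.

    Files sitting directly under root are not covered and would need
    a separate job.
    """
    subdirs = sorted(p for p in Path(root).iterdir() if p.is_dir())
    return [(f"{prefix}_{i}", str(p)) for i, p in enumerate(subdirs)]


def run_all(jobs, build_command, max_parallel=4):
    """Run one crawler process per job, at most max_parallel at a time.

    build_command(job_name, directory) is a hypothetical callback that
    must return the argument list for a single crawler instance.
    Returns a dict mapping job name to process exit code.
    """
    def run(job):
        name, directory = job
        completed = subprocess.run(build_command(name, directory))
        return name, completed.returncode

    # Threads only wait on child processes here, so a thread pool is enough.
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return dict(pool.map(run, jobs))
```

You would call `plan_jobs` on the root of the share and pass the result to `run_all` together with a function that builds the command for one instance. How each instance gets pointed at its directory depends on your fscrawler job settings, so that part is left open here. Keeping `max_parallel` small at first is probably wise, since all instances hit the same file share and the same cluster.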