FSCrawler uses a _status.json file to store the timestamp of when its last run started.
On the next run, that date is compared with the dates of the files being crawled to see whether anything has changed.
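
To make the idea concrete, here is a minimal Python sketch of that kind of check. It is not FSCrawler's actual code: the `lastrun` field name, the ISO 8601 timestamp format, and the helper names `load_last_run`, `changed_since` and `save_last_run` are all assumptions made for illustration, and it only looks at the file modification time.

```python
import json
import os
from datetime import datetime, timezone
from pathlib import Path


def load_last_run(status_path):
    """Return the stored last-run time, or None if there is no status file yet."""
    path = Path(status_path)
    if not path.exists():
        return None
    data = json.loads(path.read_text(encoding="utf-8"))
    # "lastrun" is an assumed field name holding an ISO 8601 timestamp.
    return datetime.fromisoformat(data["lastrun"])


def changed_since(root, last_run):
    """List files under root modified after last_run (every file if last_run is None)."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            mtime = datetime.fromtimestamp(os.path.getmtime(full), tz=timezone.utc)
            if last_run is None or mtime > last_run:
                changed.append(full)
    return sorted(changed)


def save_last_run(status_path, started_at):
    """Record the time the current run started (a timezone-aware datetime)."""
    payload = {"lastrun": started_at.isoformat()}
    Path(status_path).write_text(json.dumps(payload), encoding="utf-8")
```

A run would note `datetime.now(timezone.utc)` before scanning, call `changed_since(root, load_last_run(status_path))`, and then save the noted start time. Saving the start time rather than the end time means a file modified while the scan is in progress is still picked up on the following run.
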
I hope this clarifies things.