Currently I am using the url parameter from the _settings.yaml file. In url I have specified the path of the drive from which files are indexed.
If a new file is added to the file system, is it possible to index only that file and add it to the existing index using FSCrawler?
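For reference, here is roughly what the relevant part of my settings file looks like (the job name and path are placeholders):

```yaml
---
name: "my_job"
fs:
  # Root directory that FSCrawler scans for documents
  url: "/mnt/shared_drive/documents"
  # How often a running crawler re-scans the directory
  update_rate: "15m"
```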
Thanks for your reply!
That means if a new file is added to the drive and I run the FSCrawler job, it will index only that file; it will not re-index all the other files or create duplicate entries. Am I right? Correct me if I am wrong.
Yes, thanks for your help!!!
I have tried this solution and it works.
Another question: can we schedule the FSCrawler job, which I am currently running manually?
And can we provide more than one file URL path to build the index?
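In case it helps, what I had in mind for scheduling was something like a cron entry that triggers a single pass (assuming the --loop option behaves as documented; the install path is a placeholder):

```
# Run the FSCrawler job once every night at 2:00; --loop 1 exits after one scan
0 2 * * * /opt/fscrawler/bin/fscrawler my_job --loop 1
```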
I can see that my job settings file has update_rate set to 15m by default.
If there are new changes, will it pick them up automatically after 15m, or do we have to run the FSCrawler job every time from the command line? Either way, running it through the command line does index the newly added documents.
Please clarify this for me.
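From what I understand so far, these are the two modes of running it (my_job is a placeholder):

```
# Leave FSCrawler running: it re-scans the directory every update_rate (15m here)
fscrawler my_job

# Or run a single pass and exit, e.g. when triggered by an external scheduler
fscrawler my_job --loop 1
```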
Thanks! One more question: is there a way to give more than one file system URL in an FSCrawler job, so that we can index files from other file systems as well?
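If it turns out that a single job only accepts one fs.url, one workaround I was considering is to define a separate job per root directory and point them all at the same index (job names, paths, and the index name below are placeholders):

```yaml
# ~/.fscrawler/job_drive_a/_settings.yaml
---
name: "job_drive_a"
fs:
  url: "/mnt/drive_a"
elasticsearch:
  index: "documents"
```

```yaml
# ~/.fscrawler/job_drive_b/_settings.yaml
---
name: "job_drive_b"
fs:
  url: "/mnt/drive_b"
elasticsearch:
  index: "documents"
```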