Can we index incremental data for files using FSCrawler?

Hello,

Currently I am using the `url` parameter in the `_settings.yaml` file; in `url` I have set the path of the drive from which files are indexed.
If a new file is added to the file system, is it possible to index only that file and add it to the existing index using FSCrawler?
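
For reference, a minimal sketch of what my settings file looks like (the job name and paths below are just placeholders):

```yaml
name: "my_job"
fs:
  # Local drive path that FSCrawler scans for documents
  url: "/path/to/drive"
elasticsearch:
  nodes:
    - url: "http://127.0.0.1:9200"
```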

Kindly guide.

Regards,
Priyanka

That's the default behavior of FSCrawler.

Hello @dadoonet,

Thanks for your reply!
That means if a new file is added to the drive and I run the FSCrawler job, it will index only that file; it will not re-index all the other files or create duplicate entries. Am I right? Correct me if I am wrong.

Regards,
Priyanka

That's correct.

Hello @dadoonet,

Yes, thanks for your help!
I have tried this solution and it works.
Another question: can we schedule the FSCrawler job, which I am currently running manually?
And can we provide more than one file URL path to create the index?

Regards,
Priyanka

Once started, it runs every 15 minutes by default. You can change this with the `update_rate` setting: https://fscrawler.readthedocs.io/en/latest/admin/fs/local-fs.html
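
As a sketch (the job name and path are hypothetical), the relevant setting lives in the job's `_settings.yaml`:

```yaml
name: "my_job"
fs:
  url: "/path/to/drive"
  # How often FSCrawler re-scans the directory for new or changed files
  update_rate: "15m"
```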

Hello @dadoonet,

I can see that my job settings file has the update rate set to 15m by default.
If there are any new changes, will it index them automatically after 15m, or do we have to run the FSCrawler job every time using cmd? Either way, running it through cmd does index the newly added document.
Please clarify this for me.

Regards,
Priyanka

It should detect any new change every 15 minutes.

Hello @dadoonet,

Yes, it has run successfully. I was using `--loop 1` while running the FSCrawler job.
Thanks for your help!!

Regards,
Priyanka

Yeah. `--loop 1` means it runs once and exits.
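
For example (hypothetical job name, assuming the standard launcher script):

```sh
# Run the job once and exit; indexes anything new since the last run
bin/fscrawler my_job --loop 1

# Run continuously; re-scans every update_rate (15 minutes by default)
bin/fscrawler my_job
```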

Hello @dadoonet,

Thanks! One more question: is there any way to give more than one file system URL to a FSCrawler job, so that we can also index files from other file systems?

Regards,
Priyanka

It needs to be another job for now.
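
As a sketch of what that could look like (hypothetical job names and paths; both jobs can write to the same index by setting `elasticsearch.index`):

```yaml
# ~/.fscrawler/job_drive_a/_settings.yaml
name: "job_drive_a"
fs:
  url: "/mnt/drive_a"
elasticsearch:
  index: "mydocs"
```

```yaml
# ~/.fscrawler/job_drive_b/_settings.yaml
name: "job_drive_b"
fs:
  url: "/mnt/drive_b"
elasticsearch:
  index: "mydocs"
```

Each job is then started separately, e.g. `bin/fscrawler job_drive_a` and `bin/fscrawler job_drive_b`.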
