How to start with fscrawler?


(Technical Stuffer S U Khan ) #1

I downloaded the FSCrawler 2.5 zip file and unzipped it, but when I open the .bat file it does not start, and I can't work out what is going wrong.


(David Pilato) #2

Run fscrawler from the command line. It will probably tell you something.


(Technical Stuffer S U Khan ) #3

Please give me the commands to run.
I have typed fscrawler on the command line, but it's showing errors.


(David Pilato) #4

Which errors?


(Technical Stuffer S U Khan ) #5

This is the output I am getting:


(David Pilato) #6

What do you not understand in the message?
Please read: https://fscrawler.readthedocs.io/en/fscrawler-2.5/user/getting_started.html#start-fscrawler


(Technical Stuffer S U Khan ) #7

Sir, I am now getting this error... fscrawler stopped.


(David Pilato) #8

Please don't post images of text, as they are hard to read and not searchable.

Instead, paste the text and format it with the </> icon. Check the preview window.

Try running with the --debug option.

Most likely Elasticsearch is not started.
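
One quick way to test the "Elasticsearch is not started" hypothesis is to check whether anything is listening on its HTTP port. The sketch below is only an illustration: the host `127.0.0.1` and port `9200` are assumed defaults, and `is_port_open` is a hypothetical helper, not part of FSCrawler or Elasticsearch.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    # Assumed defaults; adjust to match the "elasticsearch.nodes" settings.
    host, port = "127.0.0.1", 9200
    if is_port_open(host, port):
        print(f"Something is listening on {host}:{port}")
    else:
        print(f"Nothing is listening on {host}:{port}; is Elasticsearch started?")
```

An open port only shows that some process is listening there, not that the cluster is healthy, so the Elasticsearch logs are still worth checking.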


(Technical Stuffer S U Khan ) #9

<fscrawler --config_dir ./jp catalogs/>
Sir, this command is not working.

And Elasticsearch is already running.


(David Pilato) #10

Share the logs that you are getting in debug mode please.
Share your FSCrawler JSON configuration file for the catalog job as well.
Share your elasticsearch logs as well.
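
When collecting the configuration file, it can help to confirm which file the job actually reads. The sketch below assumes the FSCrawler 2.5 layout, in which a job named `catalogs` started with `--config_dir ./jp` keeps its settings in `./jp/catalogs/_settings.json`. `settings_path` is a hypothetical helper, and the layout should be checked against the documentation for the installed version.

```python
from pathlib import Path


def settings_path(config_dir: str, job_name: str) -> Path:
    """Assumed location of a job's settings: <config_dir>/<job_name>/_settings.json."""
    return Path(config_dir) / job_name / "_settings.json"


if __name__ == "__main__":
    # Values taken from the command quoted earlier in the thread.
    path = settings_path("./jp", "catalogs")
    print(path, "exists" if path.is_file() else "is missing")
```

If the file is reported missing, the directory passed to `--config_dir` is probably not the one that holds the job settings.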

And please format your code, logs, or configuration files using the </> icon as explained in this guide, not the citation button. It will make your post more readable.

Or use markdown style like:

```
CODE
```

There's a live preview panel for exactly this reason.

Lots of people read these forums, and many of them will simply skip over a post that is difficult to read, because it's just too large an investment of their time to try and follow a wall of badly formatted text.
If your goal is to get an answer to your questions, it's in your interest to make it as easy to read and understand as possible.


(Technical Stuffer S U Khan ) #11

JSON file:

{
  "name" : "catalogs",
  "fs" : {
    "url" : "C:\tmp\jp",
    "update_rate" : "15m",
    "excludes" : [ "/~" ],
    "json_support" : false,
    "filename_as_id" : false,
    "add_filesize" : true,
    "remove_deleted" : true,
    "add_as_inner_object" : false,
    "store_source" : false,
    "index_content" : true,
    "attributes_support" : false,
    "raw_metadata" : true,
    "xml_support" : false,
    "index_folders" : true,
    "lang_detect" : false,
    "continue_on_error" : false,
    "pdf_ocr" : true,
    "ocr" : {
      "language" : "eng"
    }
  },
  "elasticsearch" : {
    "nodes" : [ {
      "host" : "127.0.0.1",
      "port" : 9200,
      "scheme" : "HTTP"
    } ],
    "bulk_size" : 100,
    "flush_interval" : "5s",
    "byte_size" : "10mb"
  },
  "rest" : {
    "scheme" : "HTTP",
    "host" : "127.0.0.1",
    "port" : 8080,
    "endpoint" : "fscrawler"
  }
}
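
One thing worth checking in a configuration like the one above is the `"url"` value. In JSON, a backslash inside a string starts an escape sequence, so `\t` is read as a tab and `\j` is not a valid escape at all. The sketch below uses Python's standard `json` module to show how a strict JSON parser treats the value as posted, compared with two valid spellings. Whether FSCrawler's own parser reports this in the same way is an assumption that the debug logs would confirm; `try_parse` is a hypothetical helper.

```python
import json

# The "url" value as written in the post: single backslashes in a JSON string.
as_posted = r'{"fs": {"url": "C:\tmp\jp"}}'

# Two spellings that are valid JSON: escaped backslashes, or forward slashes.
escaped = r'{"fs": {"url": "C:\\tmp\\jp"}}'
forward = '{"fs": {"url": "C:/tmp/jp"}}'


def try_parse(text: str):
    """Return the parsed fs.url value, or None if the text is not valid JSON."""
    try:
        return json.loads(text)["fs"]["url"]
    except json.JSONDecodeError:
        return None


for label, text in [("as posted", as_posted), ("escaped", escaped), ("forward", forward)]:
    print(label, "->", repr(try_parse(text)))
```

If the posted form is rejected here, rewriting the path with doubled backslashes or forward slashes is a cheap thing to try before digging further.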


(David Pilato) #12

Elasticsearch logs please?

And please format your code as I just described.


(David Pilato) #13

And FSCrawler logs?


(system) #14

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.