I am trying to index attachments with FSCrawler over SSH (port 22).

I have Elasticsearch installed on one machine and am trying to read data from another machine on the same network, but I get a "file does not exist" error.

This is my settings.yaml:

name: "new_attachment"
server:
  hostname: "**************"
  port: 22
  username: "**********"
  password: "**************"
  protocol: "ssh"
fs:
  url: "\\E\\foldertoindex"
  update_rate: "15m"
  excludes:
  - "*/~*"
  json_support: false
  filename_as_id: false
  add_filesize: true
  remove_deleted: true
  add_as_inner_object: false
  store_source: false
  index_content: true
  attributes_support: false
  raw_metadata: false
  xml_support: false
  index_folders: true
  lang_detect: false
  continue_on_error: false
  ocr:
    language: "eng"
    enabled: true
    pdf_strategy: "ocr_and_text"
  follow_symlinks: false
elasticsearch:
  nodes:
  - url: "http://*************:9200"
  bulk_size: 100
  flush_interval: "5s"
  byte_size: "10mb"

The error is as follows:

15:21:50,215 DEBUG [f.p.e.c.f.c.v.ElasticsearchClientV7] wait for yellow health on index [new_attachment_folder]
15:21:50,223 TRACE [f.p.e.c.f.c.v.ElasticsearchClientV7] health response: {"clus
15:21:50,227 DEBUG [f.p.e.c.f.FsParserAbstract] creating fs crawler thread [new_attachment] for [E\foldertoindex] every [15m]
15:21:50,231 INFO  [f.p.e.c.f.FsParserAbstract] FS crawler started for [new_attachment] for [E\foldertoindex] every [15m]
15:21:50,231 DEBUG [f.p.e.c.f.FsParserAbstract] Fs crawler thread [new_attachment] is now running. Run #1...
15:21:50,231 DEBUG [f.p.e.c.f.c.s.FileAbstractorSSH] Opening SSH connection to ******@***********
15:21:50,815 DEBUG [f.p.e.c.f.c.s.FileAbstractorSSH] SSH connection successful
15:21:50,819 WARN  [f.p.e.c.f.FsParserAbstract] Error while crawling E:\foldertoindex: E:\foldertoindex doesn't exists.
15:21:50,819 WARN  [f.p.e.c.f.FsParserAbstract] Full stacktrace
java.lang.RuntimeException: E:\foldertoindex doesn't
        at fr.pilato.elasticsearch.crawler.fs.FsParserAbstract.run(FsParserAbstract.java:130) [fscrawler-core-2.7-SNAPSHOT.jar:?]
        at java.lang.Thread.run(Thread.java:834) [?:?]
15:21:50,827 DEBUG [f.p.e.c.f.FsParserAbstract] Fs crawler is going to sleep for
15:22:04,786 DEBUG [f.p.e.c.f.FsCrawlerImpl] Closing FS crawler [new_attachment]
15:22:04,786 DEBUG [f.p.e.c.f.FsCrawlerImpl] FS crawler thread is still running
15:22:04,786 DEBUG [f.p.e.c.f.FsParserAbstract] Fs crawler is now waking up agai
15:22:04,786 DEBUG [f.p.e.c.f.FsParserAbstract] FS crawler thread [new_attachment] is now marked as closed...
java.lang.Exception: Stack trace
        at java.base/java.lang.Thread.dumpStack(Thread.java:1387)
        at fr.pilato.elasticsearch.crawler.fs.FsCrawlerImpl.close(FsCrawlerImpl.
        at fr.pilato.elasticsearch.crawler.fs.cli.FSCrawlerShutdownHook.run(FSCr
Terminate batch job (Y/N)? Y


Have a look at https://fscrawler.readthedocs.io/en/latest/admin/fs/ssh.html#windows-drives

The URL should be:

url: "/E:/foldertoindex"
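
As a small illustration of the path form described above, here is a minimal sketch that rewrites a Windows-style path into the leading-slash, forward-slash form. The class `SshUrl` and the method `toSshUrl` are hypothetical names made up for this example; they are not part of FSCrawler.

```java
// Hypothetical helper, not part of FSCrawler: converts a Windows-style
// path into the "/E:/folder" form used for the url setting over SSH.
class SshUrl {
    static String toSshUrl(String windowsPath) {
        // Replace every backslash with a forward slash.
        String forward = windowsPath.replace('\\', '/');
        // Make sure the result starts with a single leading slash.
        return forward.startsWith("/") ? forward : "/" + forward;
    }

    public static void main(String[] args) {
        // Prints /E:/foldertoindex
        System.out.println(toSshUrl("E:\\foldertoindex"));
    }
}
```

The resulting string is what goes into the url setting in place of the backslash form.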

Hi David,

Thanks, the solution worked, but I am now getting the following error:

16:59:20,460 INFO [f.p.e.c.f.FsParserAbstract] FS crawler started for [new_attachment] for [/E:/foldertoindex] every [15m]
16:59:20,460 DEBUG [f.p.e.c.f.FsParserAbstract] Fs crawler thread [new_attachment] is now running. Run #1...
16:59:20,464 DEBUG [f.p.e.c.f.c.s.FileAbstractorSSH] Opening SSH connection to D
16:59:21,225 DEBUG [f.p.e.c.f.c.s.FileAbstractorSSH] SSH connection successful
16:59:21,269 WARN [f.p.e.c.f.FsParserAbstract] Error while crawling /E:/foldertoindex: begin 0, end -1, length 17
16:59:21,273 WARN [f.p.e.c.f.FsParserAbstract] Full stacktrace
java.lang.StringIndexOutOfBoundsException: begin 0, end -1, length 17
at java.lang.String.checkBoundsBeginEnd(String.java:3319) ~[?:?]
at java.lang.String.substring(String.java:1874) ~[?:?]
at fr.pilato.elasticsearch.crawler.fs.FsParserAbstract.indexDirectory(FsParserAbstract.java:541) ~[fscrawler-core-2.7-SNAPSHOT.jar:?]
at fr.pilato.elasticsearch.crawler.fs.FsParserAbstract.run(FsParserAbstract.java:142) [fscrawler-core-2.7-SNAPSHOT.jar:?]
at java.lang.Thread.run(Thread.java:834) [?:?]
16:59:21,281 DEBUG [f.p.e.c.f.FsParserAbstract] Fs crawler is going to sleep for


It looks like a bug. Could you share the exact configuration file? You can remove credentials from it.

Could you open an issue with those details?
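
For anyone reading the trace: "begin 0, end -1, length 17" is what `String.substring` reports when the end index is -1, and 17 is the length of "/E:/foldertoindex". One way to get -1 is from a `lastIndexOf` call that did not find what it was looking for. The sketch below only reproduces that pattern. It is not the FSCrawler code, and the guess that a separator lookup failed is an assumption; `SubstringRepro` and `parentOf` are hypothetical names.

```java
// Hypothetical reproduction of the exception pattern, not FSCrawler's code.
class SubstringRepro {
    // Returns everything before the last occurrence of the separator.
    // If the separator is absent, lastIndexOf returns -1 and substring throws.
    static String parentOf(String path, String separator) {
        return path.substring(0, path.lastIndexOf(separator));
    }

    public static void main(String[] args) {
        String url = "/E:/foldertoindex"; // 17 characters

        // Separator present: prints /E:
        System.out.println(parentOf(url, "/"));

        try {
            // Separator absent (assumed scenario): end index becomes -1.
            parentOf(url, "\\");
        } catch (StringIndexOutOfBoundsException e) {
            // The exact message wording depends on the JDK version.
            System.out.println("Caught: " + e.getMessage());
        }
    }
}
```

Whether this is what actually happens at FsParserAbstract.java:541 can only be confirmed against the source for that build.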

Hi David,

Thank you, I will open a new issue with the details.


