Hi,
I am trying to index attachments with FSCrawler over SSH (port 22).
Elasticsearch is installed on one machine, and I am trying to read data from another machine on the same network, but I get a "file does not exist" error.
This is my settings.yaml:
---
name: "new_attachment"
server:
  hostname: "**************"
  port: 22
  username: "**********"
  password: "**************"
  protocol: "ssh"
fs:
  url: "\\E\\foldertoindex"
  update_rate: "15m"
  excludes:
  - "*/~*"
  json_support: false
  filename_as_id: false
  add_filesize: true
  remove_deleted: true
  add_as_inner_object: false
  store_source: false
  index_content: true
  attributes_support: false
  raw_metadata: false
  xml_support: false
  index_folders: true
  lang_detect: false
  continue_on_error: false
  ocr:
    language: "eng"
    enabled: true
    pdf_strategy: "ocr_and_text"
  follow_symlinks: false
elasticsearch:
  nodes:
  - url: "http://*************:9200"
  bulk_size: 100
  flush_interval: "5s"
  byte_size: "10mb"
Error is as below:
15:21:50,215 DEBUG [f.p.e.c.f.c.v.ElasticsearchClientV7] wait for yellow health on index [new_attachment_folder]
15:21:50,223 TRACE [f.p.e.c.f.c.v.ElasticsearchClientV7] health response: {"cluster_name":"elasticsearch","status":"green","timed_out":false,"number_of_nodes":3,"number_of_data_nodes":3,"active_primary_shards":1,"active_shards":2,"relocating_shards":0,"initializing_shards":0,"unassigned_shards":0,"delayed_unassigned_shards":0,"number_of_pending_tasks":0,"number_of_in_flight_fetch":0,"task_max_waiting_in_queue_millis":0,"active_shards_percent_as_number":100.0}
15:21:50,227 DEBUG [f.p.e.c.f.FsParserAbstract] creating fs crawler thread [new_attachment] for [E\foldertoindex] every [15m]
15:21:50,231 INFO [f.p.e.c.f.FsParserAbstract] FS crawler started for [new_attachment] for [E\foldertoindex] every [15m]
15:21:50,231 DEBUG [f.p.e.c.f.FsParserAbstract] Fs crawler thread [new_attachment] is now running. Run #1...
15:21:50,231 DEBUG [f.p.e.c.f.c.s.FileAbstractorSSH] Opening SSH connection to ******@***********
15:21:50,815 DEBUG [f.p.e.c.f.c.s.FileAbstractorSSH] SSH connection successful
15:21:50,819 WARN [f.p.e.c.f.FsParserAbstract] Error while crawling E:\foldertoindex: E:\foldertoindex doesn't exists.
15:21:50,819 WARN [f.p.e.c.f.FsParserAbstract] Full stacktrace
java.lang.RuntimeException: E:\foldertoindex doesn't exists.
at fr.pilato.elasticsearch.crawler.fs.FsParserAbstract.run(FsParserAbstract.java:130) [fscrawler-core-2.7-SNAPSHOT.jar:?]
at java.lang.Thread.run(Thread.java:834) [?:?]
15:21:50,827 DEBUG [f.p.e.c.f.FsParserAbstract] Fs crawler is going to sleep for 15m
15:22:04,786 DEBUG [f.p.e.c.f.FsCrawlerImpl] Closing FS crawler [new_attachment]
15:22:04,786 DEBUG [f.p.e.c.f.FsCrawlerImpl] FS crawler thread is still running
15:22:04,786 DEBUG [f.p.e.c.f.FsParserAbstract] Fs crawler is now waking up again...
15:22:04,786 DEBUG [f.p.e.c.f.FsParserAbstract] FS crawler thread [new_attachment] is now marked as closed...
java.lang.Exception: Stack trace
at java.base/java.lang.Thread.dumpStack(Thread.java:1387)
at fr.pilato.elasticsearch.crawler.fs.FsCrawlerImpl.close(FsCrawlerImpl.java:140)
at fr.pilato.elasticsearch.crawler.fs.cli.FSCrawlerShutdownHook.run(FSCrawlerShutdownHook.java:39)
Terminate batch job (Y/N)? Y
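Since the SSH connection succeeds and only the path lookup fails, one thing that may be worth checking is how the remote SSH server expects a Windows drive path to be spelled. Below is a minimal Python sketch, not part of FSCrawler, that lists a few spellings to try by hand (for example in an sftp session) before changing fs.url. The function name candidate_remote_paths is hypothetical, and the spellings are assumptions: the /E:/ form is the style commonly seen with OpenSSH for Windows, and /cygdrive/e/ applies only to Cygwin-based servers. Which one works, if any, depends on the server.

```python
import re
from typing import List


def candidate_remote_paths(raw: str) -> List[str]:
    """Return alternative spellings of a Windows drive path (hypothetical helper)."""
    # Accepts forms such as "\E\foldertoindex", "E\foldertoindex" or "E:\foldertoindex".
    match = re.fullmatch(r"[\\/]*([A-Za-z]):?[\\/]+(.+)", raw)
    if match is None:
        raise ValueError(f"not a drive-letter path: {raw!r}")
    drive = match.group(1).upper()
    # Collapse any run of backslashes or slashes into a single forward slash.
    rest = re.sub(r"[\\/]+", "/", match.group(2)).strip("/")
    return [
        f"/{drive}:/{rest}",                  # assumed OpenSSH-for-Windows style
        f"{drive}:/{rest}",                   # plain drive letter, forward slashes
        f"/cygdrive/{drive.lower()}/{rest}",  # assumed Cygwin-based server style
    ]


if __name__ == "__main__":
    # "\\E\\foldertoindex" is the value YAML yields for the fs.url shown above.
    for path in candidate_remote_paths("\\E\\foldertoindex"):
        print(path)
```

If one of these spellings lists the folder in a manual sftp session, the same string is a reasonable candidate for fs.url.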
Regards,
Umesh