I am using Debian 10, Elasticsearch 7, Java JDK 11, and FSCrawler. When I run the crawler, it only indexes the files in the /tmp/es directory on the first launch after the initial setup of _settings.yaml.
The first run looks fine, because it correctly indexes all the .pdf files in the url directory.
But after the first launch (which creates the indices in Elasticsearch), files that I add to the url directory are not added or seen by the crawler. Even stopping and restarting FSCrawler does not index the new files, unless I run ./fscrawler resumes --restart, which does index the recently added files in the url directory.
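One check that may help narrow this down is a minimal sketch under the assumption that the crawler decides what is new by comparing file timestamps with the time of its last run (I have not confirmed this for my setup). If files are copied or moved into the directory with their original, older modification times, they would look "old" under that assumption. The helper name `files_newer_than` and the 180-second window (taken from update_rate "3m") are hypothetical, chosen only for illustration.

```python
import os
import time


def files_newer_than(root, reference_epoch):
    """Return sorted paths under root modified after reference_epoch (seconds)."""
    newer = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Compare the file's modification time with the reference time.
            if os.path.getmtime(path) > reference_epoch:
                newer.append(path)
    return sorted(newer)


if __name__ == "__main__":
    # Assumption: the previous crawl ran roughly one update_rate ("3m") ago.
    last_run = time.time() - 180
    for path in files_newer_than("/tmp/es", last_run):
        print(path)
```

If a file that was just added does not show up in this listing, its modification time predates the assumed last run, which would be consistent with it being skipped until a full --restart.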
This is _settings.yaml:
---
name: "resumes"
fs:
  url: "/tmp/es"
  update_rate: "3m"
  excludes:
  - "*/~*"
  json_support: false
  filename_as_id: false
  add_filesize: true
  remove_deleted: true
  add_as_inner_object: false
  store_source: false
  index_content: true
  attributes_support: false
  raw_metadata: false
  xml_support: false
  index_folders: true
  lang_detect: false
  continue_on_error: false
  ocr:
    language: "eng"
    enabled: true
    pdf_strategy: "ocr_and_text"
  follow_symlinks: false
elasticsearch:
  nodes:
  - url: "http://192.168.225.129:9200"
  bulk_size: 100
  flush_interval: "5s"
  byte_size: "10mb"
  ssl_verification: true
fscrawler.log:
03:47:22,328 INFO [f.p.e.c.f.c.BootstrapChecks] Memory [Free/Total=Percent]: HEAP [13.6mb/494mb=2.77%], RAM [178mb/1.9gb=9.03%], Swap [524.2mb/974.9mb=53.77%].
... Starting FS crawler
... FS crawler started in watch mode. It will run unless you stop it with CTRL+C.
//Many warnings about security ...
03:47:23,188 WARN [o.e.c.RestClient] request [GET http://192.168.225.129:9200/] returned 1 warnings: [299 Elasticsearch-7.15.2-... "Elasticsearch built-in security features are not enabled. Without authentication, your cluster could be accessible to anyone. See https://www.elastic.co/guide/en/elasticsearch/reference/7.15/security-minimal-setup.html to enable security."]
...
...
03:47:23,605 INFO [f.p.e.c.f.FsParserAbstract] FS crawler started for [resumes] for [/home/pdf] every [10s]
...
Is there any configuration I need to change so that newly added files are indexed?