Hi,
Could you please share your suggestions for this issue?
I am using FSCrawler v2.6 with ELK v6.8.14. When I try to ingest PDF files, I get the errors below.
FYI, with the same _settings.json and data, I was able to run the job in FSCrawler v2.4.
16:31:16,867 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] Mapping [2/_settings.json] already exists
16:31:16,870 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] Mapping [2/_settings_folder.json] already exists
16:31:16,871 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] Mapping [5/_settings.json] already exists
16:31:16,871 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] Mapping [5/_settings_folder.json] already exists
16:31:16,872 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] Mapping [6/_settings.json] already exists
16:31:16,872 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] Mapping [6/_settings_folder.json] already exists
16:31:16,875 DEBUG [f.p.e.c.f.c.FsCrawler] Starting job [processformstemplates]...
**16:31:17,270 WARN [f.p.e.c.f.c.ElasticsearchClientManager] failed to create elasticsearch client, disabling crawler...**
**16:31:17,270 FATAL [f.p.e.c.f.c.FsCrawler] Fatal error received while running the crawler: [null]**
**16:31:17,270 DEBUG [f.p.e.c.f.c.FsCrawler] error caught**
**java.lang.NullPointerException: null**
at fr.pilato.elasticsearch.crawler.fs.client.ElasticsearchClient.lambda$buildRestClient$0(ElasticsearchClient.java:333) ~[fscrawler-elasticsearch-client-2.6-SNAPSHOT.jar:?]
at java.util.ArrayList.forEach(ArrayList.java:1257) ~[?:1.8.0_181]
at fr.pilato.elasticsearch.crawler.fs.client.ElasticsearchClient.buildRestClient(ElasticsearchClient.java:322) ~[fscrawler-elasticsearch-client-2.6-SNAPSHOT.jar:?]
at fr.pilato.elasticsearch.crawler.fs.client.ElasticsearchClientManager.start(ElasticsearchClientManager.java:91) ~[fscrawler-elasticsearch-client-2.6-SNAPSHOT.jar:?]
at fr.pilato.elasticsearch.crawler.fs.cli.FsCrawler.main(FsCrawler.java:260) [fscrawler-cli-2.6-SNAPSHOT.jar:?]
16:31:17,273 DEBUG [f.p.e.c.f.FsCrawlerImpl] Closing FS crawler [processformstemplates]
16:31:17,274 DEBUG [f.p.e.c.f.c.ElasticsearchClientManager] Closing Elasticsearch client manager
16:31:17,274 DEBUG [f.p.e.c.f.FsCrawlerImpl] ES Client Manager stopped
16:31:17,274 INFO [f.p.e.c.f.FsCrawlerImpl] FS crawler [processformstemplates] stopped
16:31:17,275 DEBUG [f.p.e.c.f.FsCrawlerImpl] Closing FS crawler [processformstemplates]
16:31:17,275 DEBUG [f.p.e.c.f.c.ElasticsearchClientManager] Closing Elasticsearch client manager
16:31:17,275 DEBUG [f.p.e.c.f.FsCrawlerImpl] ES Client Manager stopped
16:31:17,275 INFO [f.p.e.c.f.FsCrawlerImpl] FS crawler [processformstemplates] stopped
Here is my _settings.json:
$ cat _settings.json
{
  "name" : "processformstemplates",
  "fs" : {
    "url" : "/usr/share/fscrawler/data",
    "update_rate" : "15m",
    "excludes" : [ "*/~*" ],
    "json_support" : false,
    "filename_as_id" : false,
    "add_filesize" : true,
    "remove_deleted" : true,
    "add_as_inner_object" : false,
    "store_source" : false,
    "index_content" : true,
    "attributes_support" : false,
    "raw_metadata" : true,
    "xml_support" : false,
    "index_folders" : true,
    "lang_detect" : false,
    "continue_on_error" : false,
    "pdf_ocr" : true,
    "ocr" : {
      "language" : "eng"
    }
  },
  "elasticsearch" : {
    "nodes" : [ {
      "url" : "http://elasticsearch-dev.svc.cluster.local:9200"
    } ],
    "bulk_size" : 100,
    "flush_interval" : "5s",
    "byte_size" : "10mb"
  },
  "rest" : {
    "url" : "http://127.0.0.1:8080/fscrawler"
  }
}
Thanks,
Joseph