Hi, I am trying to learn about Elasticsearch and FSCrawler so I can catalogue and search my PDFs on a Windows 7 machine. It is a very steep learning curve, haha.
I have installed Elasticsearch 6.4.2 using the suggested installer and it is running as a service. Accessing http://127.0.0.1:9202, I get name: "MESH-PC" and the following:
| Field | Value |
|---|---|
| cluster_name | "elasticsearch" |
| cluster_uuid | "GhGmJvzKTwK_uktjKa3vjQ" |
| version.number | "6.4.2" |
| version.build_flavor | "unknown" |
| version.build_type | "unknown" |
| version.build_hash | "04711c2" |
| version.build_date | "2018-09-26T13:34:09.098244Z" |
| version.build_snapshot | false |
| version.lucene_version | "7.4.0" |
| version.minimum_wire_compatibility_version | "5.6.0" |
| version.minimum_index_compatibility_version | "5.0.0" |
| tagline | "You Know, for Search" |
I am using port 9202 because 9200 is already in use on this machine (information from ipconfig).
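As an aside, a quick way to double-check whether a port really is taken is to try binding to it. This is a hypothetical stdlib helper I put together for my own sanity checks; it is not part of Elasticsearch or FSCrawler:

```python
import socket

def port_in_use(port, host="127.0.0.1"):
    """Try to bind to (host, port); a failed bind means something already holds it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
            return False  # bind succeeded, so the port was free
        except OSError:
            return True   # bind failed, so the port is already in use
```

For example, `port_in_use(9200)` should return True on this machine while the other service holds the port.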
I have had numerous problems:
1. I set up c:\tmp\es manually, but editing the .json gives an "invalid code" message when I replace "/tmp/es" with "c:\tmp\es".
2. The FSCrawler message said that I would find the .json file in c:/users/david/.fscrawler/{job_name}/_settings.json, but it is actually in a folder called "c:/vtroot".
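On point 1, I have since read that in JSON a backslash is an escape character, so a Windows path has to be written with doubled backslashes (or with forward slashes, which Java accepts for Windows paths). A sketch of the relevant part of _settings.json, assuming the same job settings shown in the trace below:

```json
{
  "name" : "MESH-PC",
  "fs" : {
    "url" : "c:\\tmp\\es"
  }
}
```

Writing `"url" : "c:/tmp/es"` should parse equally well; `"url" : "c:\tmp\es"` is what produces the invalid-JSON error.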
I am also unable to create the Elasticsearch client; see the trace below.
C:\Program Files\fscrawler-2.5\bin>fscrawler --trace mesh-pc
21:06:23,662 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] Mapping [2/_settings.json] already exists
21:06:23,667 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] Mapping [2/_settings_folder.json] already exists
21:06:23,668 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] Mapping [5/_settings.json] already exists
21:06:23,669 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] Mapping [5/_settings_folder.json] already exists
21:06:23,671 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] Mapping [6/_settings.json] already exists
21:06:23,672 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] Mapping [6/_settings_folder.json] already exists
21:06:23,674 DEBUG [f.p.e.c.f.c.FsCrawler] Starting job [mesh-pc]...
21:06:24,171 TRACE [f.p.e.c.f.c.FsCrawler] settings used for this crawler: [{
"name" : "MESH-PC",
"fs" : {
"url" : "/tmp/es",
"update_rate" : "15m",
"excludes" : [ "*/~*" ],
"json_support" : false,
"filename_as_id" : false,
"add_filesize" : true,
"remove_deleted" : true,
"add_as_inner_object" : false,
"store_source" : false,
"index_content" : true,
"attributes_support" : false,
"raw_metadata" : true,
"xml_support" : false,
"index_folders" : true,
"lang_detect" : false,
"continue_on_error" : false,
"pdf_ocr" : true,
"ocr" : {
"language" : "eng"
}
},
"elasticsearch" : {
"nodes" : [ {
"host" : "127.0.0.1",
"port" : 9202,
"scheme" : "HTTP"
} ],
"bulk_size" : 100,
"flush_interval" : "5s",
"byte_size" : "10mb"
},
"rest" : {
"scheme" : "HTTP",
"host" : "127.0.0.1",
"port" : 8080,
"endpoint" : "fscrawler"
}
}]
21:06:25,593 WARN [f.p.e.c.f.c.ElasticsearchClientManager] failed to create elasticsearch client, disabling crawler...
21:06:25,593 FATAL [f.p.e.c.f.c.FsCrawler] Fatal error received while running the crawler: [Permission denied: no further information]
21:06:25,594 DEBUG [f.p.e.c.f.c.FsCrawler] error caught
java.io.IOException: Permission denied: no further information
at org.elasticsearch.client.RestClient$SyncResponseListener.get(RestClient.java:728) ~[elasticsearch-rest-client-6.3.2.jar:6.3.2]
at org.elasticsearch.client.RestClient.performRequest(RestClient.java:235) ~[elasticsearch-rest-client-6.3.2.jar:6.3.2]
at org.elasticsearch.client.RestClient.performRequest(RestClient.java:198) ~[elasticsearch-rest-client-6.3.2.jar:6.3.2]
at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:522) ~[elasticsearch-rest-high-level-client-6.3.2.jar:6.3.2]
at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:508) ~[elasticsearch-rest-high-level-client-6.3.2.jar:6.3.2]
at org.elasticsearch.client.RestHighLevelClient.info(RestHighLevelClient.java:283) ~[elasticsearch-rest-high-level-client-6.3.2.jar:6.3.2]
at fr.pilato.elasticsearch.crawler.fs.client.ElasticsearchClient.setElasticsearchBehavior(ElasticsearchClient.java:291) ~[fscrawler-elasticsearch-client-2.5.jar:?]
at fr.pilato.elasticsearch.crawler.fs.client.ElasticsearchClientManager.start(ElasticsearchClientManager.java:90) ~[fscrawler-elasticsearch-client-2.5.jar:?]
at fr.pilato.elasticsearch.crawler.fs.cli.FsCrawler.main(FsCrawler.java:260) [fscrawler-cli-2.5.jar:?]
Caused by: java.net.SocketException: Permission denied: no further information
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:1.8.0_191]
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[?:1.8.0_191]
at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvent(DefaultConnectingIOReactor.java:171) ~[httpcore-nio-4.4.5.jar:4.4.5]
at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvents(DefaultConnectingIOReactor.java:145) ~[httpcore-nio-4.4.5.jar:4.4.5]
at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.execute(AbstractMultiworkerIOReactor.java:348) ~[httpcore-nio-4.4.5.jar:4.4.5]
at org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.execute(PoolingNHttpClientConnectionManager.java:192) ~[httpasyncclient-4.1.2.jar:4.1.2]
at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase$1.run(CloseableHttpAsyncClientBase.java:64) ~[httpasyncclient-4.1.2.jar:4.1.2]
at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_191]
21:06:25,621 DEBUG [f.p.e.c.f.FsCrawlerImpl] Closing FS crawler [MESH-PC]
21:06:25,622 DEBUG [f.p.e.c.f.c.ElasticsearchClientManager] Closing Elasticsearch client manager
21:06:25,674 DEBUG [f.p.e.c.f.c.ElasticsearchClient] Closing REST client
21:06:25,710 DEBUG [f.p.e.c.f.FsCrawlerImpl] ES Client Manager stopped
21:06:25,712 INFO [f.p.e.c.f.FsCrawlerImpl] FS crawler [MESH-PC] stopped
21:06:25,718 DEBUG [f.p.e.c.f.FsCrawlerImpl] Closing FS crawler [MESH-PC]
21:06:25,720 DEBUG [f.p.e.c.f.c.ElasticsearchClientManager] Closing Elasticsearch client manager
21:06:25,720 DEBUG [f.p.e.c.f.c.ElasticsearchClient] Closing REST client
21:06:25,721 DEBUG [f.p.e.c.f.FsCrawlerImpl] ES Client Manager stopped
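For what it's worth, the "Permission denied" SocketException appears to come from the raw TCP connect itself (sun.nio.ch.SocketChannelImpl.checkConnect), not from Elasticsearch, which makes me suspect a firewall or security product is blocking the connection. A hypothetical stdlib sketch I used to separate "connection refused" (nothing listening) from "blocked/denied" (firewall):

```python
import socket

def probe(host, port, timeout=3.0):
    """Attempt a plain TCP connect and report what happened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"      # something accepted the connection
    except ConnectionRefusedError:
        return "refused"            # host reachable, but nothing listening on the port
    except OSError as e:
        return f"error: {e}"        # e.g. a firewall/AV denying or dropping the connect
```

If `probe("127.0.0.1", 9202)` returns "connected", the problem is somewhere above TCP; anything else points at the network layer.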
I would appreciate some pointers.