FScrawler "Failed to create elasticsearch client"

Hi, I'm getting an error when I try to run FSCrawler to index a PDF into Elasticsearch.

Elasticsearch version: 8.9.2 on Docker
OS: Red Hat
Firewall: off
Working curl: curl --cacert /u01/ca.crt https://localhost:9200
FSCrawler is installed directly on the OS

I converted the PDF to base64:
base64 test-ee.pdf > docs/test-ee-base64

Configuration:

---
name: "jobs"
fs:
  url: "/home/adminsspp/enterprise-elk/docs"
  update_rate: "12h"
  excludes:
  - "*/~*"
  json_support: false
  filename_as_id: false
  add_filesize: true
  remove_deleted: true
  add_as_inner_object: false
  store_source: false
  index_content: true
  attributes_support: false
  raw_metadata: false
  xml_support: false
  index_folders: true
  lang_detect: false
  continue_on_error: false
  ocr:
    language: "eng"
    enabled: true
    pdf_strategy: "ocr_and_text"
  follow_symlinks: false
elasticsearch:
  nodes:
  - url: "http://127.0.0.1:9200"
  bulk_size: 100
  flush_interval: "5s"
  byte_size: "10mb"
  #  ssl_verification: true
  ssl_verification: false
  username: "elastic"
  password: "pass1"
  cacert: "/u01/docker/volumes/enterprise-elk_certs/_data/ca/ca.crt"

Logs on debug mode:

11:55:53,972 INFO  [f.p.e.c.f.c.BootstrapChecks] Memory [Free/Total=Percent]: HEAP [88mb/2.9gb=2.94%], RAM [206.2mb/11.7gb=1.72%], Swap [3.3gb/3.9gb=83.01%].
11:55:53,975 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] Mapping [6/_settings.json] already exists
11:55:53,975 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] Mapping [6/_settings_folder.json] already exists
11:55:53,976 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] Mapping [7/_settings.json] already exists
11:55:53,976 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] Mapping [7/_settings_folder.json] already exists
11:55:53,977 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] Mapping [7/_wpsearch_settings.json] already exists
11:55:53,977 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] Mapping [8/_settings.json] already exists
11:55:53,978 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] Mapping [8/_settings_folder.json] already exists
11:55:53,978 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] Mapping [8/_wpsearch_settings.json] already exists
11:55:53,978 DEBUG [f.p.e.c.f.c.FsCrawlerCli] Starting job [jobs]...
11:55:54,486 INFO  [f.p.e.c.f.FsCrawlerImpl] Starting FS crawler
11:55:54,487 INFO  [f.p.e.c.f.FsCrawlerImpl] FS crawler started in watch mode. It will run unless you stop it with CTRL+C.
11:55:54,604 WARN  [f.p.e.c.f.c.ElasticsearchClient] We are not doing SSL verification. It's not recommended for production.
11:55:54,659 DEBUG [f.p.e.c.f.c.ElasticsearchClient] get version
11:55:55,273 WARN  [f.p.e.c.f.c.ElasticsearchClient] Failed to create elasticsearch client on Elasticsearch{nodes=[http://127.0.0.1:9200], index='jobs', indexFolder='jobs_folder', bulkSize=100, flushInterval=5s, byteSize=10mb, username='elastic', pipeline='null', pathPrefix='null', sslVerification='false'}. Message: Can not execute GET http://127.0.0.1:9200/ : Unexpected end of file from server.
11:55:55,274 FATAL [f.p.e.c.f.c.FsCrawlerCli] We can not start Elasticsearch Client. Exiting.
fr.pilato.elasticsearch.crawler.fs.client.ElasticsearchClientException: Can not execute GET http://127.0.0.1:9200/ : Unexpected end of file from server
        at fr.pilato.elasticsearch.crawler.fs.client.ElasticsearchClient.httpCall(ElasticsearchClient.java:795) ~[fscrawler-elasticsearch-client-2.10-SNAPSHOT.jar:?]
        at fr.pilato.elasticsearch.crawler.fs.client.ElasticsearchClient.httpGet(ElasticsearchClient.java:744) ~[fscrawler-elasticsearch-client-2.10-SNAPSHOT.jar:?]
        at fr.pilato.elasticsearch.crawler.fs.client.ElasticsearchClient.getVersion(ElasticsearchClient.java:233) ~[fscrawler-elasticsearch-client-2.10-SNAPSHOT.jar:?]
        at fr.pilato.elasticsearch.crawler.fs.client.ElasticsearchClient.start(ElasticsearchClient.java:196) ~[fscrawler-elasticsearch-client-2.10-SNAPSHOT.jar:?]
        at fr.pilato.elasticsearch.crawler.fs.service.FsCrawlerManagementServiceElasticsearchImpl.start(FsCrawlerManagementServiceElasticsearchImpl.java:65) ~[fscrawler-core-2.10-SNAPSHOT.jar:?]
        at fr.pilato.elasticsearch.crawler.fs.FsCrawlerImpl.start(FsCrawlerImpl.java:116) ~[fscrawler-core-2.10-SNAPSHOT.jar:?]
        at fr.pilato.elasticsearch.crawler.fs.cli.FsCrawlerCli.startEsClient(FsCrawlerCli.java:407) [fscrawler-cli-2.10-SNAPSHOT.jar:?]
        at fr.pilato.elasticsearch.crawler.fs.cli.FsCrawlerCli.runner(FsCrawlerCli.java:383) [fscrawler-cli-2.10-SNAPSHOT.jar:?]
        at fr.pilato.elasticsearch.crawler.fs.cli.FsCrawlerCli.main(FsCrawlerCli.java:119) [fscrawler-cli-2.10-SNAPSHOT.jar:?]
Caused by: jakarta.ws.rs.ProcessingException: java.net.SocketException: Unexpected end of file from server
        at org.glassfish.jersey.client.internal.HttpUrlConnector.apply(HttpUrlConnector.java:275) ~[jersey-client-3.1.3.jar:?]
        at org.glassfish.jersey.client.ClientRuntime.invoke(ClientRuntime.java:300) ~[jersey-client-3.1.3.jar:?]
        at org.glassfish.jersey.client.JerseyInvocation.lambda$invoke$1(JerseyInvocation.java:675) ~[jersey-client-3.1.3.jar:?]
        at org.glassfish.jersey.client.JerseyInvocation.call(JerseyInvocation.java:697) ~[jersey-client-3.1.3.jar:?]
        at org.glassfish.jersey.client.JerseyInvocation.lambda$runInScope$3(JerseyInvocation.java:691) ~[jersey-client-3.1.3.jar:?]
        at org.glassfish.jersey.internal.Errors.process(Errors.java:292) ~[jersey-common-3.1.3.jar:?]
        at org.glassfish.jersey.internal.Errors.process(Errors.java:274) ~[jersey-common-3.1.3.jar:?]
        at org.glassfish.jersey.internal.Errors.process(Errors.java:205) ~[jersey-common-3.1.3.jar:?]
        at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:390) ~[jersey-common-3.1.3.jar:?]
        at org.glassfish.jersey.client.JerseyInvocation.runInScope(JerseyInvocation.java:691) ~[jersey-client-3.1.3.jar:?]
        at org.glassfish.jersey.client.JerseyInvocation.invoke(JerseyInvocation.java:674) ~[jersey-client-3.1.3.jar:?]
        at org.glassfish.jersey.client.JerseyInvocation$Builder.method(JerseyInvocation.java:422) ~[jersey-client-3.1.3.jar:?]
        at fr.pilato.elasticsearch.crawler.fs.client.ElasticsearchClient.httpCall(ElasticsearchClient.java:769) ~[fscrawler-elasticsearch-client-2.10-SNAPSHOT.jar:?]
        ... 8 more
Caused by: java.net.SocketException: Unexpected end of file from server
        at java.base/sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:862) ~[?:?]
        at java.base/sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:689) ~[?:?]
        at java.base/sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:859) ~[?:?]
        at java.base/sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:689) ~[?:?]
        at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1604) ~[?:?]
        at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1509) ~[?:?]
        at java.base/java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:527) ~[?:?]
        at org.glassfish.jersey.client.internal.HttpUrlConnector._apply(HttpUrlConnector.java:415) ~[jersey-client-3.1.3.jar:?]
        at org.glassfish.jersey.client.internal.HttpUrlConnector.apply(HttpUrlConnector.java:273) ~[jersey-client-3.1.3.jar:?]
        at org.glassfish.jersey.client.ClientRuntime.invoke(ClientRuntime.java:300) ~[jersey-client-3.1.3.jar:?]
        at org.glassfish.jersey.client.JerseyInvocation.lambda$invoke$1(JerseyInvocation.java:675) ~[jersey-client-3.1.3.jar:?]
        at org.glassfish.jersey.client.JerseyInvocation.call(JerseyInvocation.java:697) ~[jersey-client-3.1.3.jar:?]
        at org.glassfish.jersey.client.JerseyInvocation.lambda$runInScope$3(JerseyInvocation.java:691) ~[jersey-client-3.1.3.jar:?]
        at org.glassfish.jersey.internal.Errors.process(Errors.java:292) ~[jersey-common-3.1.3.jar:?]
        at org.glassfish.jersey.internal.Errors.process(Errors.java:274) ~[jersey-common-3.1.3.jar:?]
        at org.glassfish.jersey.internal.Errors.process(Errors.java:205) ~[jersey-common-3.1.3.jar:?]
        at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:390) ~[jersey-common-3.1.3.jar:?]
        at org.glassfish.jersey.client.JerseyInvocation.runInScope(JerseyInvocation.java:691) ~[jersey-client-3.1.3.jar:?]
        at org.glassfish.jersey.client.JerseyInvocation.invoke(JerseyInvocation.java:674) ~[jersey-client-3.1.3.jar:?]
        at org.glassfish.jersey.client.JerseyInvocation$Builder.method(JerseyInvocation.java:422) ~[jersey-client-3.1.3.jar:?]
        at fr.pilato.elasticsearch.crawler.fs.client.ElasticsearchClient.httpCall(ElasticsearchClient.java:769) ~[fscrawler-elasticsearch-client-2.10-SNAPSHOT.jar:?]
        ... 8 more
11:55:55,282 DEBUG [f.p.e.c.f.FsCrawlerImpl] Closing FS crawler [jobs]
11:55:55,282 DEBUG [f.p.e.c.f.c.ElasticsearchClient] Closing Elasticsearch client manager
11:55:55,287 DEBUG [f.p.e.c.f.s.FsCrawlerManagementServiceElasticsearchImpl] Elasticsearch Management Service stopped
11:55:55,287 DEBUG [f.p.e.c.f.c.ElasticsearchClient] Closing Elasticsearch client manager
11:55:55,287 DEBUG [f.p.e.c.f.s.FsCrawlerDocumentServiceElasticsearchImpl] Elasticsearch Document Service stopped
11:55:55,287 DEBUG [f.p.e.c.f.FsCrawlerImpl] ES Client Manager stopped
11:55:55,287 INFO  [f.p.e.c.f.FsCrawlerImpl] FS crawler [jobs] stopped
11:55:55,293 DEBUG [f.p.e.c.f.FsCrawlerImpl] Closing FS crawler [jobs]

Any suggestions?

Hey

I think I need to implement this TODO:

I don't think cacert is a supported option.
But read this: Elasticsearch settings — FSCrawler 2.10-SNAPSHOT documentation

This might help.
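
One thing that stands out in the post: the working curl call uses https://localhost:9200, while the job config points at http://127.0.0.1:9200. "Unexpected end of file from server" is what a Java HTTP client typically reports when it speaks plain HTTP to a port that expects TLS. A minimal sketch of the elasticsearch section, assuming the cluster only accepts HTTPS on port 9200 and that skipping certificate verification is acceptable for a test setup (the cacert line is left out because it does not appear to be a supported setting):

```yaml
elasticsearch:
  nodes:
  # Assumption: same host and port as the working curl call, but over HTTPS
  - url: "https://127.0.0.1:9200"
  bulk_size: 100
  flush_interval: "5s"
  byte_size: "10mb"
  # Disables certificate checks; not recommended for production
  ssl_verification: false
  username: "elastic"
  password: "pass1"
```

If the connection then succeeds, the remaining question is only how to get certificate verification working, which the linked documentation page covers.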
