Starting Fscrawler with SSL error

Good catch. It should be on port 9300 and not 9200 anyway.

Elasticsearch is listening on both ports 9200 & 9300:

root@localhost:~# lsof -i :9200
COMMAND   PID          USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
java    11969 elasticsearch  540u  IPv6 828161      0t0  TCP *:wap-wsp (LISTEN)
root@localhost:~# lsof -i :9300
COMMAND   PID          USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
java    11969 elasticsearch  538u  IPv6 831121      0t0  TCP *:vrace (LISTEN)
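(Side note: lsof maps port numbers to service names from /etc/services, which is why 9200 shows up as wap-wsp and 9300 as vrace. Adding -nP keeps the output numeric:)

# -n skips DNS lookups, -P keeps numeric ports instead of service names
lsof -nP -i :9200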

Did you a) remove the discovery.seed_hosts setting and restart, and after that b) try to set the elastic user password using the command suggested? If so, what is the output now?

I meant that this is wrong:

discovery.seed_hosts:
- 192.168.16.200:9200

And should be (if needed to be set):

discovery.seed_hosts:
- 192.168.16.200:9300

I deleted the discovery.seed_hosts setting and re-tried to reset the password... nada:

root@localhost:/usr/share/elasticsearch# bin/elasticsearch-reset-password -i -u elastic

ERROR: Failed to determine the health of the cluster. Unexpected http status [503], with exit code 65

When you ran that elasticsearch-reset-password command, something was likely logged in the elasticsearch.log file. Tail the log file as you run the command.
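For example (a sketch; the log path and file name depend on your install and cluster name):

# terminal 1: follow the log (path assumes a package install; the file
# is usually named after your cluster, e.g. elasticsearch.log)
sudo tail -f /var/log/elasticsearch/elasticsearch.log

# terminal 2: run the reset command and watch what gets logged
sudo /usr/share/elasticsearch/bin/elasticsearch-reset-password -i -u elastic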

But also, look at this (my elasticsearch process is running listening for HTTPS on 9200)

# bin/elasticsearch-reset-password -i -u elastic --url http://localhost:9200
ERROR: Failed to determine the health of the cluster. , with exit code 69

# bin/elasticsearch-reset-password -i -u elastic --url https://localhost:9200
This tool will reset the password of the [elastic] user.
You will be prompted to enter the password.
Please confirm that you would like to continue [y/N]

So please give the second variant a try too.

Kevin, I tried with port 9300:

root@localhost:/usr/share/elasticsearch# bin/elasticsearch-reset-password -i -u elastic --url https://192.168.16.200:9300
12:17:51.584 [main] WARN  org.elasticsearch.common.ssl.DiagnosticTrustManager - failed to establish trust with server at [192.168.16.200]; the server provided a certificate with subject name [CN=localhost], fingerprint [896248a831fa6e8dc31be9fa04fb41916966a002], no keyUsage and no extendedKeyUsage; the certificate is valid between [2025-02-25T12:14:56Z] and [2124-02-02T12:14:56Z] (current time is [2025-02-26T11:17:51.583590580Z], certificate dates are valid); the session uses cipher suite [TLS_AES_256_GCM_SHA384] and protocol [TLSv1.3]; the certificate does not have any subject alternative names; the certificate is issued by [CN=Elasticsearch security auto-configuration transport CA]; the certificate is signed by (subject [CN=Elasticsearch security auto-configuration transport CA] fingerprint [de176ff0d5279d0943f87f734356c378d567697e]) which is self-issued; the [CN=Elasticsearch security auto-configuration transport CA] certificate is not trusted in this ssl context ([xpack.security.http.ssl (with trust configuration: Composite-Trust{JDK-trusted-certs,StoreTrustConfig{path=certs/http.p12, password=<non-empty>, type=PKCS12, algorithm=PKIX}})])

So I think, like David, that the CA provided by ES is not good (date expired).

I suggested

# bin/elasticsearch-reset-password -i -u elastic --url https://localhost:9200

you executed

root@localhost:/usr/share/elasticsearch# bin/elasticsearch-reset-password -i -u elastic --url https://192.168.16.200:9300

Spot the difference?

BTW, the response has "certificate dates are valid" in it.

Also

failed to establish trust with server at [192.168.16.200]

but the response has

"with subject name [CN=localhost]"

Please try again with the command I actually suggested.

The certificate is self-signed (and auto-generated); its generation is probably in your logs somewhere, and would have happened at the first startup of Elasticsearch.

Thanks Kevin,
with the reset pwd command on localhost:9200:

root@localhost:/usr/share/elasticsearch# bin/elasticsearch-reset-password -i -u elastic --url https://localhost:9200

ERROR: Failed to determine the health of the cluster. Unexpected http status [503], with exit code 65

With port 9300, the CA has invalid dates!

Port 9300 is typically used for internal cluster communication. You should not have to use it yourself, not at this point anyway. Essentially, forget it for now. You should typically communicate with Elasticsearch on port 9200.

Your self-signed, auto-generated certificate is probably fine. It has valid dates (they are shown above). You can look more closely with

openssl s_client -connect localhost:9200 < /dev/null

Here's mine:

Certificate chain
 0 s:CN = u2024
   i:CN = Elasticsearch security auto-configuration HTTP CA
   a:PKEY: rsaEncryption, 4096 (bit); sigalg: RSA-SHA256
   v:NotBefore: Feb 19 23:44:20 2025 GMT; NotAfter: Feb 19 23:44:20 2027 GMT
 1 s:CN = Elasticsearch security auto-configuration HTTP CA
   i:CN = Elasticsearch security auto-configuration HTTP CA
   a:PKEY: rsaEncryption, 4096 (bit); sigalg: RSA-SHA256
   v:NotBefore: Feb 19 23:44:18 2025 GMT; NotAfter: Feb 19 23:44:18 2028 GMT
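If you just want the subject, issuer, and validity dates, you can pipe the server certificate through openssl x509 (plain openssl, nothing Elasticsearch-specific):

openssl s_client -connect localhost:9200 < /dev/null 2>/dev/null \
  | openssl x509 -noout -subject -issuer -dates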

But I can't figure out what's wrong with your one-node cluster right now. So, if there is no useful data in it, I return to a suggestion I made a few posts ago, which is to clean up and start again:

  1. stop elasticsearch
  2. remove content of your data directory, log directory, and certs (see the sketch after this list)
  3. in short, return to a known clean state
  4. check your elasticsearch.yml, paste here if unsure
  5. start elasticsearch
  6. you should see in the elasticsearch.log file that you now need to set the elastic password
  7. check it's running and listening on port 9200 via lsof -i :9200
  8. use the command
    bin/elasticsearch-reset-password -i -u elastic --url https://localhost:9200
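A sketch of steps 1-3 and 5, assuming a Debian/RPM package install managed by systemd (adjust the paths to your layout; this deletes all data, logs, and certs, so only do it if the cluster holds nothing you need):

sudo systemctl stop elasticsearch               # step 1
sudo rm -rf /var/lib/elasticsearch/*            # step 2: data
sudo rm -rf /var/log/elasticsearch/*            # step 2: logs
sudo rm -rf /etc/elasticsearch/certs/*          # step 2: auto-generated certs
# step 4: review /etc/elasticsearch/elasticsearch.yml before restarting
sudo systemctl start elasticsearch              # step 5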

Hi Kevin,
I made a clean install of ES.
ES listens on port 9200 in TCP.
I have the password generated by ES,
but curl says:

root@localhost:/usr/share/elasticsearch# curl -k https://192.168.16.200:9200
{"error":{"root_cause":[{"type":"security_exception","reason":"missing authentication credentials for REST request [/]","header":{"WWW-Authenticate":["Bearer realm=\"security\"","ApiKey","Basic realm=\"security\", charset=\"UTF-8\""]}}],"type":"security_exception","reason":"missing authentication credentials for REST request [/]","header":{"WWW-Authenticate":["Bearer realm=\"security\"","ApiKey","Basic realm=\"security\", charset=\"UTF-8\""]}},"status":401}

SSL config:

root@localhost:/usr/share/elasticsearch# openssl s_client -connect localhost:9200 < /dev/null
Connecting to ::1
CONNECTED(00000003)
Can't use SSL_get_servername
depth=1 CN=Elasticsearch security auto-configuration HTTP CA
verify error:num=19:self-signed certificate in certificate chain
verify return:1
depth=1 CN=Elasticsearch security auto-configuration HTTP CA
verify return:1
depth=0 CN=localhost
verify return:1
---
Certificate chain
 0 s:CN=localhost
   i:CN=Elasticsearch security auto-configuration HTTP CA
   a:PKEY: rsaEncryption, 4096 (bit); sigalg: RSA-SHA256
   v:NotBefore: Mar  4 15:05:15 2025 GMT; NotAfter: Mar  4 15:05:15 2027 GMT
 1 s:CN=Elasticsearch security auto-configuration HTTP CA
   i:CN=Elasticsearch security auto-configuration HTTP CA
   a:PKEY: rsaEncryption, 4096 (bit); sigalg: RSA-SHA256
   v:NotBefore: Mar  4 15:05:14 2025 GMT; NotAfter

You are actually (probably) good.

You just need to add some args to curl:

EUSER="elastic"
EPASS="put pwd generated by ES here"
EHOST="localhost" 
EPORT="9200"

Or maybe EHOST=192.168.16.200 in your case, but it probably does not matter.

and then

curl -s -k -u "${EUSER}":"${EPASS}"  "https://${EHOST}:${EPORT}/"

fingers crossed ...

You should see something similar to:

{
  "name" : "your-nodename",
  "cluster_name" : "your-cluster-name",
  "cluster_uuid" : "aN.....Lg",
  "version" : {
    "number" : "8.17.2",
    "build_flavor" : "default",
    "build_type" : "tar",
    "build_hash" : "747663ddda3421467150de0e4301e8d4bc636b0c",
    "build_date" : "2025-02-05T22:10:57.067596412Z",
    "build_snapshot" : false,
    "lucene_version" : "9.12.0",
    "minimum_wire_compatibility_version" : "7.17.0",
    "minimum_index_compatibility_version" : "7.0.0"
  },
  "tagline" : "You Know, for Search"
}

Thanks for your help Kevin.
Always an error:

root@localhost:~# curl -s -k -u elastic:thegoodpwdfromES https://192.168.16.200:9200
{"error":{"root_cause":[{"type":"status_exception","reason":"Cluster state has not been recovered yet, cannot write to the [null] index"}],"type":"authentication_processing_error","reason":"failed to promote the auto-configured elastic password hash","caused_by":{"type":"status_exception","reason":"Cluster state has not been recovered yet, cannot write to the [null] index"}},"status":503}

With -v I don't see an error with SSL; ES is very difficult to initialize.

root@localhost:/etc/ssl# curl -s -k -v -u elastic:PpeYIOLJ091DUfVRdN4c https://192.168.16.200:9200
*   Trying 192.168.16.200:9200...
* Connected to 192.168.16.200 (192.168.16.200) port 9200
* ALPN: curl offers h2,http/1.1
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384 / x25519 / RSASSA-PSS
* ALPN: server did not agree on a protocol. Uses default.
* Server certificate:
*  subject: CN=localhost
*  start date: Mar  4 15:05:15 2025 GMT
*  expire date: Mar  4 15:05:15 2027 GMT
*  issuer: CN=Elasticsearch security auto-configuration HTTP CA
*  SSL certificate verify result: self-signed certificate in certificate chain (19), continuing anyway.
*   Certificate level 0: Public key type RSA (4096/152 Bits/secBits), signed using sha256WithRSAEncryption
*   Certificate level 1: Public key type RSA (4096/152 Bits/secBits), signed using sha256WithRSAEncryption
* using HTTP/1.x
* Server auth using Basic with user 'elastic'
> GET / HTTP/1.1
> Host: 192.168.16.200:9200
> Authorization: Basic ZWxhc3RpYzpQcGVZSU9MSjA5MURVZlZSZE40Yw==
> User-Agent: curl/8.9.1
> Accept: */*
> 
* Request completely sent off
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
< HTTP/1.1 503 Service Unavailable
< X-elastic-product: Elasticsearch
< content-type: application/json
< content-length: 391
< 
* Connection #0 to host 192.168.16.200 left intact

I wonder if this is now a permissions issue.

BTW, you don't need to be the root user to run the curl command. Usually Elasticsearch runs as a non-root user; maybe after your tidy-up some file/directory has ended up being owned by root.

See:

$ ls -ld /var/lib/elasticsearch/
drwxr-s--- 5 elasticsearch elasticsearch 4096 Mar  4 21:28 /var/lib/elasticsearch/

so this whole tree is owned by the "elasticsearch" user (this is Ubuntu)

$ sudo find /var/lib/elasticsearch/ /var/log/elasticsearch/ \! -user elasticsearch -ls
$

no output, everything in those trees is owned by the elasticsearch user.

$ sudo find /var/lib/elasticsearch/ -user elasticsearch -type f | wc
   2878    2878  206668
$ sudo find /var/log/elasticsearch/ -user elasticsearch -type f | wc
     47      47    2160
$
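You can also ask the cluster directly how healthy it thinks it is (standard _cluster/health API, reusing the credentials from earlier):

curl -s -k -u "${EUSER}":"${EPASS}" "https://localhost:9200/_cluster/health?pretty"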

The error you have now has been seen before, e.g. in this earlier topic:

and there it's not clear what was wrong, but the summary was:

This means that your cluster is not in a state where it can read/write data. I cannot tell you what caused that, but it almost certainly has nothing to do with your password setup - the root of the problem is that security cannot do its job if it cannot read and write data from the cluster.

and the suggestion was to check the cluster.initial_master_nodes setting. As you want just a single-node cluster, then

cluster.initial_master_nodes: []

should be fine.
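For reference, a single-node setup can also be expressed with discovery.type, which makes the node elect itself (a sketch, assuming you really want exactly one node; this setting conflicts with discovery.seed_hosts and cluster.initial_master_nodes, so remove those if you use it):

# elasticsearch.yml - minimal single-node sketch
discovery.type: single-node
network.host: 192.168.16.200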

Yes Kevin, the problem was indeed cluster.initial_master_nodes.
cluster.initial_master_nodes: ["192.168.16.200"] is working with curl, but only with option -k.
So with FSCrawler I have the error from David (!):

fr.pilato.elasticsearch.crawler.fs.client.ElasticsearchClientException: Can not execute GET https://192.168.16.200:9200/ : PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target

Read my first answer: Starting Fscrawler with SSL error - #4 by dadoonet

That should help.
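(As an alternative to switching verification off, one option is to import Elasticsearch's auto-generated HTTP CA into the Java truststore that the client JVM uses; a sketch, where both paths are assumptions to adjust for your install:)

# http_ca.crt is where a package install drops the auto-generated HTTP CA;
# cacerts is the default JVM truststore (default password: changeit)
sudo keytool -importcert \
  -keystore "$JAVA_HOME/lib/security/cacerts" \
  -storepass changeit \
  -alias elastic-http-ca \
  -file /etc/elasticsearch/certs/http_ca.crt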


Thanks David, you're right!

FSCrawler is working with ssl_verification: false and username & password in the .yaml.
What is the error for crawling in _status.json?
Thanks one more (and more...) time!

17:58:08,346 INFO  [f.p.e.c.f.FsCrawlerImpl] Starting FS crawler
17:58:08,346 INFO  [f.p.e.c.f.FsCrawlerImpl] FS crawler started in watch mode. It will run unless you stop it with CTRL+C.
17:58:08,441 WARN  [f.p.e.c.f.c.ElasticsearchClient] We are not doing SSL verification. It's not recommended for production.
17:58:08,705 INFO  [f.p.e.c.f.c.ElasticsearchClient] Elasticsearch Client connected to a node running version 8.17.3
17:58:08,709 WARN  [f.p.e.c.f.c.ElasticsearchClient] Semantic search is enabled but we are running Elasticsearch with a basic license although we need either an enterprise or trial license.We will not be able to use the semantic search features ATM. We might switch later to a vector embeddings generation.
17:58:08,712 WARN  [f.p.e.c.f.c.ElasticsearchClient] We are not doing SSL verification. It's not recommended for production.
17:58:08,751 INFO  [f.p.e.c.f.c.ElasticsearchClient] Elasticsearch Client connected to a node running version 8.17.3
17:58:08,755 WARN  [f.p.e.c.f.c.ElasticsearchClient] Semantic search is enabled but we are running Elasticsearch with a basic license although we need either an enterprise or trial license.We will not be able to use the semantic search features ATM. We might switch later to a vector embeddings generation.
17:58:08,804 INFO  [f.p.e.c.f.FsParserAbstract] FS crawler started for [job_name] for [/home/admin/Downloads] every [15m]
17:58:08,833 INFO  [f.p.e.c.f.t.TikaInstance] OCR is disabled.
17:58:09,556 WARN  [f.p.e.c.f.FsParserAbstract] Error while crawling /home/admin/Downloads: /home/admin/.fscrawler/job_name/_status.json
17:58:09,556 INFO  [f.p.e.c.f.FsParserAbstract] Closing FS crawler file abstractor [FileAbstractorFile].


Could you share your _settings.yml file?
And also run it with:

FS_JAVA_OPTS="-DLOG_LEVEL=trace" bin/fscrawler 

OK, this is _settings.yaml:

name: "job_name"
fs:
  url: "/home/admin/Downloads"
  update_rate: "15m"
  excludes:
  - "*/~*"
  - "/appdata/*"
  - "/domains/*"
  - "/isos/*"
  json_support: false
  filename_as_id: false
  add_filesize: true
  remove_deleted: true
  add_as_inner_object: false
  store_source: false
  index_content: true
  attributes_support: false
  raw_metadata: false
  xml_support: false
  index_folders: true
  lang_detect: false
  continue_on_error: false
  ocr:
    language: "eng"
    enabled: false
    pdf_strategy: "ocr_and_text"
  follow_symlinks: false
elasticsearch:
  nodes:
  - url: "https://192.168.16.200:9200"
  bulk_size: 100
  flush_interval: "5s"
  byte_size: "10mb"
  ssl_verification: false
  username: "elastic"
  password: "x"                                      

And with debug (it seems that FSCrawler is indexing job_name!):

19:37:26,066 DEBUG [f.p.e.c.p.FsCrawlerPluginsManager] Loading plugins
19:37:26,079 INFO  [f.p.e.c.f.c.BootstrapChecks] Memory [Free/Total=Percent]: HEAP [46.3mb/974mb=4.76%], RAM [269.2mb/3.7gb=6.92%], Swap [3.7gb/3.7gb=98.23%].
19:37:26,079 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] Mapping [6/_settings.json] already exists
19:37:26,079 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] Mapping [6/_settings_folder.json] already exists
19:37:26,079 INFO  [f.console] No job specified. Here is the list of existing jobs:
19:37:26,080 DEBUG [f.p.e.c.f.c.FsCrawlerJobsUtil] Adding job [job_name]
19:37:26,080 DEBUG [f.p.e.c.f.c.FsCrawlerJobsUtil] Ignoring [_default] dir as no settings file has been found
19:37:26,080 INFO  [f.console] [1] - job_name
19:37:26,080 INFO  [f.console] Choose your job [1-1]...
1
19:37:34,909 DEBUG [f.p.e.c.f.c.FsCrawlerCli] Starting job [job_name]...
19:37:34,911 TRACE [f.p.e.c.f.f.MetaFileHandler] Reading file _settings.yaml from /home/admin/.fscrawler/job_name
19:37:35,033 WARN  [f.p.e.c.f.s.Elasticsearch] username is deprecated. Use apiKey instead.
19:37:35,033 WARN  [f.p.e.c.f.s.Elasticsearch] password is deprecated. Use apiKey instead.
19:37:35,048 TRACE [f.p.e.c.f.c.FsCrawlerCli] settings used for this crawler: [---
name: "job_name"
fs:
  url: "/home/admin/Downloads"
  update_rate: "15m"
  excludes:
  - "*/~*"
  - "/appdata/*"
  - "/domains/*"
  - "/isos/*"
  json_support: false
  filename_as_id: false
  add_filesize: true
  remove_deleted: true
  add_as_inner_object: false
  store_source: false
  index_content: true
  attributes_support: false
  raw_metadata: false
  xml_support: false
  index_folders: true
  lang_detect: false
  continue_on_error: false
  ocr:
    language: "eng"
    enabled: false
    pdf_strategy: "ocr_and_text"
  follow_symlinks: false
elasticsearch:
  nodes:
  - url: "https://192.168.16.200:9200"
  bulk_size: 100
  flush_interval: "5s"
  byte_size: "10mb"
  username: "elastic"
  ssl_verification: false
  push_templates: true
  semantic_search: true
]
19:37:35,048 DEBUG [f.p.e.c.p.FsCrawlerPluginsManager] Starting plugins
19:37:35,056 DEBUG [f.p.e.c.p.FsCrawlerPluginsManager] Found FsCrawlerExtensionFsProvider extension for type [http]
19:37:35,056 DEBUG [f.p.e.c.p.FsCrawlerPluginsManager] Found FsCrawlerExtensionFsProvider extension for type [local]
19:37:35,056 DEBUG [f.p.e.c.p.FsCrawlerPluginsManager] Found FsCrawlerExtensionFsProvider extension for type [s3]
19:37:35,061 DEBUG [f.p.e.c.f.FsParserAbstract] creating fs crawler thread [job_name] for [/home/admin/Downloads] every [15m]
19:37:35,061 INFO  [f.p.e.c.f.FsCrawlerImpl] Starting FS crawler
19:37:35,061 INFO  [f.p.e.c.f.FsCrawlerImpl] FS crawler started in watch mode. It will run unless you stop it with CTRL+C.
19:37:35,107 WARN  [f.p.e.c.f.c.ElasticsearchClient] We are not doing SSL verification. It's not recommended for production.
19:37:35,122 DEBUG [f.p.e.c.f.c.ElasticsearchClient] get version
19:37:35,122 TRACE [f.p.e.c.f.c.ElasticsearchClient] Calling GET https://192.168.16.200:9200/ with params []
19:37:35,433 TRACE [f.p.e.c.f.c.ElasticsearchClient] GET https://192.168.16.200:9200/ gives {
  "name" : "localhost.localdomain",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "yj0Sbx9YSvSKleqsM8PTOw",
  "version" : {
    "number" : "8.17.3",
    "build_flavor" : "default",
    "build_type" : "rpm",
    "build_hash" : "a091390de485bd4b127884f7e565c0cad59b10d2",
    "build_date" : "2025-02-28T10:07:26.089129809Z",
    "build_snapshot" : false,
    "lucene_version" : "9.12.0",
    "minimum_wire_compatibility_version" : "7.17.0",
    "minimum_index_compatibility_version" : "7.0.0"
  },
  "tagline" : "You Know, for Search"
}

19:37:35,438 DEBUG [f.p.e.c.f.c.ElasticsearchClient] get version returns 8.17.3 and 8 as the major version number
19:37:35,438 INFO  [f.p.e.c.f.c.ElasticsearchClient] Elasticsearch Client connected to a node running version 8.17.3
19:37:35,438 DEBUG [f.p.e.c.f.c.ElasticsearchClient] Semantic search is enabled and we are running on a version of Elasticsearch 8.17.3 which is 8.17 or higher. We will try to use the semantic search features.
19:37:35,438 DEBUG [f.p.e.c.f.c.ElasticsearchClient] get license
19:37:35,438 TRACE [f.p.e.c.f.c.ElasticsearchClient] Calling GET https://192.168.16.200:9200/_license with params []
19:37:35,441 TRACE [f.p.e.c.f.c.ElasticsearchClient] GET https://192.168.16.200:9200/_license gives {
  "license" : {
    "status" : "active",
    "uid" : "5b7e420d-85bd-4dc7-b688-d252a4bff117",
    "type" : "basic",
    "issue_date" : "2025-03-06T13:40:28.634Z",
    "issue_date_in_millis" : 1741268428634,
    "max_nodes" : 1000,
    "max_resource_units" : null,
    "issued_to" : "elasticsearch",
    "issuer" : "elasticsearch",
    "start_date_in_millis" : -1
  }
}