Starting FSCrawler with SSL error

Hi, the Elasticsearch daemon is running, and when I start FSCrawler I get this fatal error. Thanks.

15:43:15,867 WARN  [f.p.e.c.f.c.ElasticsearchClient] Failed to create elasticsearch client on Elasticsearch{nodes=[https://192.168.16.200:9200], index='job_name', indexFolder='job_name_folder', bulkSize=100, flushInterval=5s, byteSize=10mb, username='null', pipeline='null', pathPrefix='null', sslVerification='true', caCertificatePath='null', pushTemplates='true'}. Message: Can not execute GET https://192.168.16.200:9200/ : Unsupported or unrecognized SSL message.
15:43:15,867 FATAL [f.p.e.c.f.c.FsCrawlerCli] We can not start Elasticsearch Client. Exiting.

Welcome!

Is Elasticsearch running with https?

Hello David, yes, ES is running with https.

If you are using a self-signed certificate (default behavior of Elasticsearch), you might need to tell fscrawler about that.

Have a look at Elasticsearch settings — FSCrawler 2.10-SNAPSHOT documentation
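
For illustration only, the relevant part of the job's _settings.yaml could look like this (the URL, credentials, and CA path below are placeholders; the key names mirror the sslVerification and caCertificatePath fields shown in your log line):

name: "job_name"
elasticsearch:
  nodes:
  - url: "https://192.168.16.200:9200"
  username: "elastic"
  password: "<your-password>"
  # either point FSCrawler at the CA that signed the HTTP certificate...
  ca_certificate_path: "/path/to/http_ca.crt"
  # ...or, as a last resort, skip certificate verification:
  #ssl_verification: false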

Unsupported or unrecognized SSL message.

Based on that error message, it doesn't look like you're really hitting a server with SSL (https) enabled. There could be various reasons for that, but the simplest test to be sure would be to run

curl -k https://192.168.16.200:9200/

from the same server that is trying to run FSCrawler.

That should either:

  • Fail with some sort of network or SSL error, which implies SSL isn't configured the way you think it is, and we can try to work out why that is
  • Fail with "missing credentials", in which case SSL is configured, and we need to work out why FSCrawler is reporting the error that it is.

Thanks David, I'm reading all of that.
Thanks Tim, yes, the error message from curl is an SSL error:

root@localhost:/fscrawler-distribution-2.10-SNAPSHOT/bin# curl -k https://192.168.16.200:9200/
curl: (35) OpenSSL/3.2.2: error:0A0000C6:SSL routines::packet length too long

I can't set a password for Elasticsearch

root@localhost:/fscrawler-distribution-2.10-SNAPSHOT/bin# /usr/share/elasticsearch/bin/elasticsearch-reset-password -u elastic

ERROR: Failed to determine the health of the cluster. Unexpected http status [503], with exit code 65

:wink:

A few comments:

  • Are you running FSCrawler and Elasticsearch in a Docker instance?
  • I don't think you can run Elasticsearch as root. Did you try to start it as root? (See the command after this list.)
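
With a systemd/package install the service normally runs as the elasticsearch user no matter who ran systemctl; something like this shows the actual process owner (the [e] just keeps grep from matching itself):

ps -eo user,pid,cmd | grep [e]lasticsearch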

David, ES is launched as root:

root@localhost:/usr/share/elasticsearch# systemctl status elasticsearch.service 
● elasticsearch.service - Elasticsearch
     Loaded: loaded (/usr/lib/systemd/system/elasticsearch.service; disabled; preset: disabled)
    Drop-In: /usr/lib/systemd/system/service.d
             └─10-timeout-abort.conf
     Active: active (running) since Wed 2025-02-26 01:52:04 CET; 11s ago

What about the first question?

No Docker, David.

Could you share the Elasticsearch logs from the start?

I started ES, and I don't understand why discovery is using ports 9300 and so on?

[2025-02-26T03:15:15,898][WARN ][o.e.t.ThreadPool         ] [localhost.localdomain] failed to run scheduled task [org.elasticsearch.systemd.SystemdPlugin$$Lambda/0x00003c000148a328@2304d102] on thread pool [org.elasticsearch.common.util.concurrent.EsExecutors$DirectExecutorService@34f2fe18]
java.lang.WrongThreadException: Attempted access outside owning thread
        at jdk.internal.foreign.MemorySessionImpl.wrongThread(MemorySessionImpl.java:314) ~[?:?]
        at jdk.internal.misc.ScopedMemoryAccess$ScopedAccessError.newRuntimeException(ScopedMemoryAccess.java:113) ~[?:?]
        at jdk.internal.misc.ScopedMemoryAccess.copyMemory(ScopedMemoryAccess.java:131) ~[?:?]
        at java.nio.ByteBuffer.putArray(ByteBuffer.java:1361) ~[?:?]
        at java.nio.ByteBuffer.put(ByteBuffer.java:1302) ~[?:?]
        at java.nio.ByteBuffer.put(ByteBuffer.java:1338) ~[?:?]
        at org.elasticsearch.nativeaccess.Systemd.notify(Systemd.java:73) ~[elasticsearch-native-8.17.2.jar:?]
        at org.elasticsearch.nativeaccess.Systemd.notify_extend_timeout(Systemd.java:48) ~[elasticsearch-native-8.17.2.jar:?]
        at org.elasticsearch.systemd.SystemdPlugin.lambda$createComponents$0(SystemdPlugin.java:90) ~[?:?]
        at org.elasticsearch.threadpool.Scheduler$ReschedulingRunnable.doRun(Scheduler.java:224) ~[elasticsearch-8.17.2.jar:?]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:1023) ~[elasticsearch-8.17.2.jar:?]
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:27) ~[elasticsearch-8.17.2.jar:?]
        at org.elasticsearch.threadpool.ThreadPool$1.run(ThreadPool.java:514) ~[elasticsearch-8.17.2.jar:?]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572) ~[?:?]
        at java.util.concurrent.FutureTask.run(FutureTask.java:317) ~[?:?]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
        at java.lang.Thread.run(Thread.java:1575) ~[?:?]
[2025-02-26T03:15:17,308][WARN ][o.e.n.Node               ] [localhost.localdomain] timed out after [discovery.initial_state_timeout=30s] while waiting for initial discovery state; for troubleshooting guidance see [https://www.elastic.co/guide/en/elasticsearch/reference/8.17/discovery-troubleshooting.html]
[2025-02-26T03:15:17,309][WARN ][o.e.c.c.ClusterFormationFailureHelper] [localhost.localdomain] master not discovered yet, this node has not previously joined a bootstrapped cluster, and this node must discover master-eligible nodes [localhost] to bootstrap a cluster: have discovered [{localhost.localdomain}{ZLVWWMsPQXmESAcvm0a6Gw}{zlN2HE9-Thusqbo9NOU0wg}{localhost.localdomain}{192.168.16.200}{192.168.16.200:9300}{cdfhilmrstw}{8.17.2}{7000099-8521000}]; discovery will continue using [127.0.0.1:9300, 127.0.0.1:9301, 127.0.0.1:9302, 127.0.0.1:9303, 127.0.0.1:9304, 127.0.0.1:9305, [::1]:9300, [::1]:9301, [::1]:9302, [::1]:9303, [::1]:9304, [::1]:9305] from hosts providers and [{localhost.localdomain}{ZLVWWMsPQXmESAcvm0a6Gw}{zlN2HE9-Thusqbo9NOU0wg}{localhost.localdomain}{192.168.16.200}{192.168.16.200:9300}{cdfhilmrstw}{8.17.2}{7000099-8521000}] from last-known cluster state; node term 0, last-accepted version 0 in term 0; for troubleshooting guidance, see https://www.elastic.co/guide/en/elasticsearch/reference/8.17/discovery-troubleshooting.html
[2025-02-26T03:15:17,316][INFO ][o.e.h.AbstractHttpServerTransport] [localhost.localdomain] publish_address {192.168.16.200:9200}, bound_addresses {[::]:9200}
[2025-02-26T03:15:17,328][INFO ][o.e.n.Node               ] [localhost.localdomain] started {localhost.localdomain}{ZLVWWMsPQXmESAcvm0a6Gw}{zlN2HE9-Thusqbo9NOU0wg}{localhost.localdomain}{192.168.16.200}{192.168.16.200:9300}{cdfhilmrstw}{8.17.2}{7000099-8521000}{ml.allocated_processors=4, ml.allocated_processors_double=4.0, ml.max_jvm_size=2042626048, ml.config_version=12.0.0, xpack.installed=true, transform.config_version=10.0.0, ml.machine_memory=4080406528}
[2025-02-26T03:15:17,328][INFO ][o.e.n.j.JdkPosixCLibrary ] [localhost.localdomain] Sending 7 bytes to socket
[2025-02-26T03:15:20,353][WARN ][o.e.h.n.Netty4HttpServerTransport] [localhost.localdomain] received plaintext http traffic on an https channel, closing connection Netty4HttpChannel{localAddress=/127.0.0.1:9200, remoteAddress=/127.0.0.1:34020}
[2025-02-26T03:15:22,844][WARN ][o.e.h.n.Netty4HttpServerTransport] [localhost.localdomain] received plaintext http traffic on an https channel, closing connection Netty4HttpChannel{localAddress=/127.0.0.1:9200, remoteAddress=/127.0.0.1:34030}
[2025-02-26T03:15:24,487][WARN ][o.e.h.n.Netty4HttpServerTransport] [localhost.localdomain] received plaintext http traffic on an https channel, closing connection Netty4HttpChannel{localAddress=/127.0.0.1:9200, remoteAddress=/127.0.0.1:34042}
[2025-02-26T03:15:27,315][WARN ][o.e.c.c.ClusterFormationFailureHelper] [localhost.localdomain] master not discovered yet, this node has not previously joined a bootstrapped cluster, and this node must discover master-eligible nodes [localhost] to bootstrap a cluster: have discovered [{localhost.localdomain}{ZLVWWMsPQXmESAcvm0a6Gw}{zlN2HE9-Thusqbo9NOU0wg}{localhost.localdomain}{192.168.16.200}{192.168.16.200:9300}{cdfhilmrstw}{8.17.2}{7000099-8521000}]; discovery will continue using [127.0.0.1:9300, 127.0.0.1:9301, 127.0.0.1:9302, 127.0.0.1:9303, 127.0.0.1:9304, 127.0.0.1:9305, [::1]:9300, [::1]:9301, [::1]:9302, [::1]:9303, [::1]:9304, [::1]:9305] from hosts providers and [{localhost.localdomain}{ZLVWWMsPQXmESAcvm0a6Gw}{zlN2HE9-Thusqbo9NOU0wg}{localhost.localdomain}{192.168.16.200}{192.168.16.200:9300}{cdfhilmrstw}{8.17.2}{7000099-8521000}] from last-known cluster state; node term 0, last-accepted version 0 in term 0; for troubleshooting guidance, see https://www.elastic.co/guide/en/elasticsearch/reference/8.17/discovery-troubleshooting.html
[2025-02-26T03:15:28,343][WARN ][o.e.h.n.Netty4HttpServerTransport] [localhost.localdomain] received plaintext http traffic on an https channel, closing connection Netty4HttpChannel{localAddress=/127.0.0.1:9200, remoteAddress=/127.0.0.1:34044}
[2025-02-26T03:15:30,485][WARN ][o.e.h.n.Netty4HttpServerTransport] [localhost.localdomain] received plaintext http traffic on an https channel, closing connection Netty4HttpChannel{localAddress=/127.0.0.1:9200, remoteAddress=/127.0.0.1:49908}
[2025-02-26T03:15:32,852][WARN ][o.e.h.n.Netty4HttpServerTransport] [localhost.localdomain] received plaintext http traffic on an https channel, closing connection Netty4HttpChannel{localAddress=/127.0.0.1:9200, remoteAddress=/127.0.0.1:49920}
[2025-02-26T03:15:34,758][WARN ][o.e.h.n.Netty4HttpServerTransport] [localhost.localdomain] received plaintext http traffic on an https channel, closing connection Netty4HttpChannel{localAddress=/127.0.0.1:9200, remoteAddress=/127.0.0.1:49922}

Something is talking HTTP to port 9200, which is listening for HTTPS.
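
While those connections are coming in, something like this (run as root, assuming iproute2's ss is available) should show which local process owns those 127.0.0.1 client sockets:

ss -tnp | grep ':9200'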

Did your Elasticsearch instance ever work? I mean, were you ever able to use curl on the (single-node) cluster? Or log in to Kibana? Or anything, really?

Depending on how you installed it, you often get some text after starting ES for the first time, giving you:

a: the (random) elastic password it created for you
b: info on how to enable Kibana
c: some other commands to later add other nodes to the cluster.

Any of that sound familiar?

Can you share your elasticsearch.yml file, please?

Thanks Kevin.
ES is running but FSCrawler can't connect (see above).

#bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# By default Elasticsearch is only accessible on localhost. Set a different
# address here to expose this node on the network:
#
network.host: 0.0.0.0
#
# By default Elasticsearch listens for HTTP traffic on the first free port it
# finds starting at 9200. Set a specific HTTP port here:
#
#http.port: 9200
#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when this node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
discovery.seed_hosts:
- 192.168.16.200:9200
#
# Bootstrap the cluster using an initial set of master-eligible nodes:
#
#cluster.initial_master_nodes: ["node-1", "node-2"]
#
# For more information, consult the discovery and cluster formation module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Allow wildcard deletion of indices:
#
#action.destructive_requires_name: false

#----------------------- BEGIN SECURITY AUTO CONFIGURATION -----------------------
#
# The following settings, TLS certificates, and keys have been automatically      
# generated to configure Elasticsearch security features on 25-02-2025 12:14:54
#
# --------------------------------------------------------------------------------

# Enable security features
xpack.security.enabled: true

xpack.security.enrollment.enabled: true

# Enable encryption for HTTP API client connections, such as Kibana, Logstash, and Agents
xpack.security.http.ssl:
  enabled: true
  keystore.path: certs/http.p12

# Enable encryption and mutual authentication between cluster nodes
xpack.security.transport.ssl:
  enabled: true
  verification_mode: certificate
  keystore.path: certs/transport.p12
  truststore.path: certs/transport.p12
# Create a new cluster with the current node only
# Additional nodes can still join the cluster later
cluster.initial_master_nodes: ["localhost"]

# Allow HTTP API connections from anywhere
# Connections are encrypted and require user authentication
http.host: 0.0.0.0

# Allow other nodes to join the cluster from anywhere
# Connections are encrypted and mutually authenticated
#transport.host: 0.0.0.0

#----------------------- END SECURITY AUTO CONFIGURATION -------------------------
"/etc/elasticsearch/elasticsearch.yml" 120L, 4052B                                                                                                                                                                                                                                             120,21        Bot

FSCrawler won't be able to connect until you solve the "curl" issue:

root@localhost:/fscrawler-distribution-2.10-SNAPSHOT/bin# curl -k https://192.168.16.200:9200/
curl: (35) OpenSSL/3.2.2: error:0A0000C6:SSL routines::packet length too long

AFAIK this seems to indicate a mix of SSL and non-SSL calls...

David, when I connect to the URL 192.168.16.200:9200 with https it works, but on the Mac the Elasticsearch CA certificate is not available (I've created another user).

This is a confusing thread.

Your setup uses an auto-generated certificate; Elasticsearch would have created it when it first started, which is why you see the Safari warning.
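
On a package install, that auto-generated CA usually lands in /etc/elasticsearch/certs/http_ca.crt; if you copy it to the Mac you can verify properly instead of clicking through the warning (assuming the certificate's SANs include that IP):

curl --cacert http_ca.crt https://192.168.16.200:9200/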

As has been said, until you figure out the curl issue, you probably can't go further. I am also not clear whether you have ever used the Elasticsearch cluster for anything, as you write

"ES is running"

which is a bit of a non-answer; a Linux process may be running, but is it (or has it) been used in any useful way? I suspect not.

Look for lines matching the following in your Elasticsearch log files (including the .gz files):

o.e.x.s.InitialNodeSecurityAutoConfiguration
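
For example, assuming the default package log location (zgrep searches the plain files and the .gz rotations in one pass):

zgrep InitialNodeSecurityAutoConfiguration /var/log/elasticsearch/*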

I'm suspecting it might say you need to set the elastic password.

"Auto-configuration will not generate a password for the elastic built-in superuser, as we cannot determine if there is a terminal attached to the elasticsearch process. You can use the bin/elasticsearch-reset-password tool to set the password for the elastic user."

This depends a little on how you installed/set up your Elasticsearch instance.

Kevin, you're right:

[2025-02-26T03:38:40,798][INFO ][o.e.x.s.InitialNodeSecurityAutoConfiguration] [localhost.localdomain] Auto-configuration will not generate a password for the elastic built-in superuser, as we cannot  determine if there is a terminal attached to the elasticsearch process. You can use the `bin/elasticsearch-reset-password` tool to set the password for the elastic user

OK, when I reset the password, either with the user elastic or another one, the reply is:

root@localhost:/usr/share/elasticsearch/bin# ./elasticsearch-reset-password -fv -u florent --url https://192.168.16.200:9200
Unexpected http status [401] while attempting to determine cluster health. Will retry at most 5 more times.
^Croot@localhost:/usr/share/elasticsearch/bin# ./elasticsearch-reset-password -fv -u elastic --url https://192.168.16.200:9200
Unexpected http status [401] while attempting to determine cluster health. Will retry at most 5 more times.


Lucky guess :slight_smile:

Please try from the directory above:

bin/elasticsearch-reset-password -i -u elastic

I am not sure your Elasticsearch process is listening on the interface with address 192.168.16.200; you can double-check this with

lsof -i :9200

But, to be honest, assuming you have never done anything useful with Elasticsearch on this machine yet, I would be tempted to just stop Elasticsearch, delete the data and log directories, maybe even the certs, and start afresh.
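
Roughly, assuming a default package install (double-check the paths before deleting anything; if you also remove the certs, you'd have to remove the auto-generated security section from elasticsearch.yml so auto-configuration can run again on the next start):

systemctl stop elasticsearch
rm -rf /var/lib/elasticsearch/*
rm -rf /var/log/elasticsearch/*
# optionally: rm -rf /etc/elasticsearch/certs
systemctl start elasticsearch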

BTW, I'm also not convinced (nor sure the other way) that your setting

discovery.seed_hosts:
- 192.168.16.200:9200

is required here.

(check for o.e.c.c.Coordinator in your logs)
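
For what it's worth, seed hosts are contacted on the transport port (9300 by default), not the HTTP port, so if you keep the setting it should look like this; on a single node you can usually drop it entirely and rely on cluster.initial_master_nodes:

discovery.seed_hosts:
- 192.168.16.200:9300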