Hello,
I am trying out the Open Web Crawler in preparation for when it replaces the existing Enterprise Search crawler and associated functionality.
My environment runs single-node Elasticsearch, Kibana, and Enterprise Search (for the current Enterprise Search crawler), all on v8.17.2 in a single VM. This works as expected.
I am setting up the Open Crawler in the same VM, but I am running into an issue when trying to start a crawl.
Starting a crawl produces the following error:
elastic crawler]# docker exec -it crawler bin/crawler crawl config/crawler.yml
[crawl:67ee7d1ae16dda5d61cce017] [primary] Initialized an in-memory URL queue for up to 10000 URLs
[crawl:67ee7d1ae16dda5d61cce017] [primary] ES connections will be authorized with configured API key
[crawl:67ee7d1ae16dda5d61cce017] [primary] ES connections will use SSL without ca_fingerprint
The client is unable to verify that the server is Elasticsearch. Some functionality may not be compatible if the server is running an unsupported product.
[crawl:67ee7d1ae16dda5d61cce017] [primary] Failed to reach ES at https://localhost:9200
If I run the following from inside the crawler container to check whether the Elasticsearch port is reachable, it seems OK:
elastic crawler]# docker exec -it crawler sh
/app $ nc -zv 192.168.50.96 9200
192.168.50.96 (192.168.50.96:9200) open
If I run the same from the VM command line, it also responds:
elastic crawler]# nc -zv 192.168.50.96 9200
Ncat: Version 7.92 ( https://nmap.org/ncat )
Ncat: Connected to 192.168.50.96:9200.
Ncat: 0 bytes sent, 0 bytes received in 0.01 seconds.
If I run either of these with 'localhost' instead of the IP, both still appear to connect:
/app $ nc -zv localhost 9200
localhost ([::1]:9200) open
elastic crawler]# nc -zv localhost 9200
Ncat: Version 7.92 ( https://nmap.org/ncat )
Ncat: Connected to ::1:9200.
Ncat: 0 bytes sent, 0 bytes received in 0.01 seconds.
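As far as I understand, nc only confirms that the TCP port is open; it does not attempt the TLS handshake or certificate verification that the crawler has to complete. A minimal Python sketch of a check that separates the two steps is below. The function name probe and its return shape are my own invention for illustration and are not part of the crawler, and the CA path used with it would be whatever copy of http_ca.crt is readable from where the script runs.

```python
import socket
import ssl


def probe(host, port, ca_file=None, timeout=5.0):
    """Return (tcp_ok, tls_ok, detail) for host:port.

    tcp_ok: a plain TCP connection succeeded (what `nc -zv` checks).
    tls_ok: a TLS handshake with certificate and hostname verification
            also succeeded.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError as exc:
        return False, False, f"TCP connect failed: {exc}"

    # The default context verifies the certificate chain and the hostname.
    # If ca_file is given, it is used as the trust store (hypothetical path).
    context = ssl.create_default_context(cafile=ca_file)
    try:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return True, True, f"TLS OK ({tls.version()})"
    except (ssl.SSLError, OSError) as exc:
        sock.close()
        return True, False, f"TLS handshake failed: {exc}"
```

Calling probe("localhost", 9200, ca_file="http_ca.crt") and comparing the result with probe("localhost", 9200) should show whether the port being open and the certificate being trusted are two different things in this setup.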
I have crawler.yml configured as below:
elasticsearch:
  host: https://localhost
  port: 9200
  api_key: [redacted]
output_sink: elasticsearch
output_index: my-search-testindex
domains:
  - url: https://contoso.com
I do have the Elasticsearch host set up with security enabled. In the current Enterprise Search config, I have these SSL settings:
elasticsearch.ssl.enabled: true
elasticsearch.ssl.certificate_authority: /usr/share/enterprise-search/http_ca.crt
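The crawler log above mentions ca_fingerprint, so I assume the crawler can be given the SHA-256 fingerprint of this same CA certificate instead of a path to the file. A rough Python sketch for computing that fingerprint from a PEM file is below. The helper name ca_fingerprint_from_pem is made up, it expects the PEM text of a single certificate, and whether the crawler wants the colon-separated form or plain hex is something I have not confirmed.

```python
import hashlib
import ssl


def ca_fingerprint_from_pem(pem_text):
    """Return the SHA-256 fingerprint of a PEM certificate as colon-separated uppercase hex."""
    # Strip the BEGIN/END markers and base64-decode to the raw DER bytes.
    der = ssl.PEM_cert_to_DER_cert(pem_text.strip())
    digest = hashlib.sha256(der).hexdigest().upper()
    # Group the hex digest into byte pairs: "AB:CD:EF:..."
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))
```

For example, reading /usr/share/enterprise-search/http_ca.crt as text and passing it to this function would give a value that could be tried in the elasticsearch section of the crawler config, assuming that setting works the way the log message suggests.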
I figure I am missing something in the documentation or elsewhere. Any pointers are appreciated!