Esrally SSL error certificate verify failed

Hi,

I'm trying to benchmark an existing Elasticsearch cluster to try out Rally. When starting the benchmark I run into the following error:

    2020-11-10 09:27:39,430 ActorAddr-(T|:57843)/PID:16851 esrally.client INFO Creating ES client connected to [{'host': 'elastic.test.local', 'port': 443}] with options [{'timeout': 5, 'verify_certs': False, 'use_ssl': True, 'basic_auth_user': 'elastic', 'basic_auth_password': '*****'}]
    2020-11-10 09:27:39,430 ActorAddr-(T|:57843)/PID:16851 esrally.client INFO SSL support: on
    2020-11-10 09:27:39,438 ActorAddr-(T|:57843)/PID:16851 esrally.client INFO SSL certificate verification: off
    2020-11-10 09:27:39,438 ActorAddr-(T|:57843)/PID:16851 esrally.client WARNING User has enabled SSL but disabled certificate verification. This is dangerous but may be ok for a benchmark. Disabling urllib warnings now to avoid a logging storm. See https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings for details.
    2020-11-10 09:27:39,438 ActorAddr-(T|:57843)/PID:16851 esrally.client INFO SSL client authentication: off
    2020-11-10 09:27:39,438 ActorAddr-(T|:57843)/PID:16851 esrally.client INFO HTTP basic authentication: on2020-11-10 09:27:39,438 ActorAddr-(T|:57843)/PID:16851 esrally.client INFO HTTP compression: off
    2020-11-10 09:29:45,369 ActorAddr-(T|:57843)/PID:16851 esrally.actor ERROR Error in race control
    Traceback (most recent call last):

      File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 601, in urlopen
        chunked=chunked)

      File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 346, in _make_request
        self._validate_conn(conn)

      File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 852, in _validate_conn
        conn.connect()

      File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 340, in connect
        ssl_context=context)

      File "/usr/lib/python3/dist-packages/urllib3/util/ssl_.py", line 332, in ssl_wrap_socket
        return context.wrap_socket(sock, server_hostname=server_hostname)

      File "/usr/lib/python3.6/ssl.py", line 407, in wrap_socket
        _context=self, _session=session)

      File "/usr/lib/python3.6/ssl.py", line 817, in __init__
        self.do_handshake()

      File "/usr/lib/python3.6/ssl.py", line 1077, in do_handshake
        self._sslobj.do_handshake()

      File "/usr/lib/python3.6/ssl.py", line 689, in do_handshake
        self._sslobj.do_handshake()

    ssl.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:852)

I am using the verify_certs:false as client option described at https://esrally.readthedocs.io/en/stable/command_line_reference.html#client-options

Rally still seems to try and verify the certificate. The command i used is:

esrally race --track=pmc \
	--pipeline=benchmark-only \
	--target-hosts=elastic.test.local:443 \
	--client-options="timeout:60,use_ssl:true,verify_certs:false,basic_auth_user:'elastic',basic_auth_password:''"

Hi,

I don't see any particular indication that Rally is verifying a certificate and that is failing. I think it is failing due to something else. Did you try to connect via curl -k -u elastic:your_password_here "https://elastic.test.local:443/?pretty" to the cluster and did that work?

Daniel

Hi Daniel,

I have and that works perfectly fine:

ricardo@NB:~$ curl -k -u elastic:my_password_here "https://elastic.test.local:443/?pretty"
{
  "name" : "elastic-es-default-1",
  "cluster_name" : "elastic",
  "cluster_uuid" : "eBGjq4GdT5yKQkuAN19aPw",
  "version" : {
    "number" : "7.8.1",
    "build_flavor" : "default",
    "build_type" : "docker",
    "build_hash" : "b5ca9c58fb664ca8bf9e4057fc229b3396bf3a89",
    "build_date" : "2020-07-21T16:40:44.668009Z",
    "build_snapshot" : false,
    "lucene_version" : "8.5.1",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

Hi Ricardo,

that seems fine indeed. Thanks for checking. Can you please do the following?

# remove the old log file
rm -rf ~/.rally/logs/rally.log
# enable additional logging
export THESPLOG_FILE="${HOME}/.rally/logs/actor-system-internal.log"
export THESPLOG_FILE_MAXSIZE="204800"
export THESPLOG_THRESHOLD="INFO"
# rerun your benchmark
esrally race --track=pmc \
	--pipeline=benchmark-only \
	--target-hosts=elastic.test.local:443 \
	--client-options="timeout:60,use_ssl:true,verify_certs:false,basic_auth_user:'elastic',basic_auth_password:''"

and then share the following files:

  • ~/.rally/logs/rally.log
  • ~/.rally/logs/actor-system-internal.log

Thanks!

Daniel

Hi,

I've followed the steps, here are the logs (I don't see any attachment options for log files):

~/.rally/logs/actor-system-internal.log

2020-11-10 16:56:24.911793 p1350 I    ++++ Admin started @ ActorAddr-(T|:1900) / gen (3, 9)
2020-11-10 16:56:24.937727 p1350 I    Pending Actor request received for esrally.racecontrol.BenchmarkActor reqs {'coordinator': True} from ActorAddr-(T|:49252)
2020-11-10 16:56:24.946654 p1352 I    Starting Actor esrally.racecontrol.BenchmarkActor at ActorAddr-(T|:49254) (parent ActorAddr-(T|:1900), admin ActorAddr-(T|:1900), srcHash None)
2020-11-10 16:58:32.781013 p1343 I    ActorSystem shutdown requested.
2020-11-10 16:58:32.805614 p1350 I    ---- shutdown completed
2020-11-10 16:58:32.806355 p1343 I    ActorSystem shutdown complete.

~/.rally/logs/rally.log

2020-11-10 15:56:24,743 -not-actor-/PID:1343 esrally.rally INFO Race id [460a009b-f31b-4507-83d9-b9aed328b56e]
2020-11-10 15:56:24,744 -not-actor-/PID:1343 esrally.rally INFO OS [uname_result(system='Linux', node='NB-2020-A347', release='4.4.0-18362-Microsoft', version='#1049-Microsoft Thu Aug 14 12:01:00 PST 2020', machine='x86_64', processor='x86_64')]
2020-11-10 15:56:24,744 -not-actor-/PID:1343 esrally.rally INFO Python [namespace(_multiarch='x86_64-linux-gnu', cache_tag='cpython-36', hexversion=50727408, name='cpython', version=sys.version_info(major=3, minor=6, micro=9, releaselevel='final', serial=0))]
2020-11-10 15:56:24,744 -not-actor-/PID:1343 esrally.rally INFO Rally version [1.4.1]
2020-11-10 15:56:24,744 -not-actor-/PID:1343 esrally.utils.net INFO Connecting directly to the Internet (no proxy support).
2020-11-10 15:56:24,860 -not-actor-/PID:1343 esrally.rally INFO Detected a working Internet connection.
2020-11-10 15:56:24,861 -not-actor-/PID:1343 esrally.rally INFO Actor system already running locally? [False]
2020-11-10 15:56:24,861 -not-actor-/PID:1343 esrally.actor INFO Starting actor system with system base [multiprocTCPBase] and capabilities [{'coordinator': True, 'ip': '127.0.0.1', 'Convention Address.IPv4': '127.0.0.1:1900'}].
2020-11-10 15:56:24,912 -not-actor-/PID:1350 root INFO ++++ Actor System gen (3, 9) started, admin @ ActorAddr-(T|:1900)
2020-11-10 15:56:24,928 -not-actor-/PID:1343 esrally.racecontrol INFO User specified pipeline [benchmark-only].
2020-11-10 15:56:24,935 -not-actor-/PID:1343 esrally.racecontrol INFO Using configured hosts [{'host': 'elastic.test.local', 'port': 443}]
2020-11-10 15:56:24,936 -not-actor-/PID:1343 esrally.actor INFO Joining already running actor system with system base [multiprocTCPBase].
2020-11-10 15:56:24,938 ActorAddr-(T|:1900)/PID:1350 esrally.actor INFO Capabilities [{'coordinator': True, 'ip': '127.0.0.1', 'Convention Address.IPv4': '127.0.0.1:1900', 'Thespian ActorSystem Name': 'multiprocTCPBase', 'Thespian ActorSystem Version': 2, 'Thespian Watch Supported': True, 'Python Version': (3, 6, 9, 'final', 0), 'Thespian Generation': (3, 9), 'Thespian Version': '1605023784893'}] match requirements [{'coordinator': True}].
2020-11-10 15:56:24,938 ActorAddr-(T|:1900)/PID:1350 esrally.actor INFO Capabilities [{'coordinator': True, 'ip': '127.0.0.1', 'Convention Address.IPv4': '127.0.0.1:1900', 'Thespian ActorSystem Name': 'multiprocTCPBase', 'Thespian ActorSystem Version': 2, 'Thespian Watch Supported': True, 'Python Version': (3, 6, 9, 'final', 0), 'Thespian Generation': (3, 9), 'Thespian Version': '1605023784893'}] match requirements [{'coordinator': True}].
2020-11-10 15:56:24,994 -not-actor-/PID:1352 elasticsearch WARNING GET https://elastic.test.local:443/_cluster/health?wait_for_nodes=%3E%3D1 [status:N/A request:0.028s]
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 601, in urlopen
    chunked=chunked)
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 346, in _make_request
    self._validate_conn(conn)
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 852, in _validate_conn
    conn.connect()
  File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 340, in connect
    ssl_context=context)
  File "/usr/lib/python3/dist-packages/urllib3/util/ssl_.py", line 332, in ssl_wrap_socket
    return context.wrap_socket(sock, server_hostname=server_hostname)
  File "/usr/lib/python3.6/ssl.py", line 407, in wrap_socket
    _context=self, _session=session)
  File "/usr/lib/python3.6/ssl.py", line 817, in __init__
    self.do_handshake()
  File "/usr/lib/python3.6/ssl.py", line 1077, in do_handshake
    self._sslobj.do_handshake()
  File "/usr/lib/python3.6/ssl.py", line 689, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:852)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/elasticsearch/connection/http_urllib3.py", line 238, in perform_request
    method, url, body, retries=Retry(False), headers=request_headers, **kw
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 639, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/usr/lib/python3/dist-packages/urllib3/util/retry.py", line 343, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/usr/lib/python3/dist-packages/six.py", line 692, in reraise
    raise value.with_traceback(tb)
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 601, in urlopen
    chunked=chunked)
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 346, in _make_request
    self._validate_conn(conn)
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 852, in _validate_conn
    conn.connect()
  File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 340, in connect
    ssl_context=context)
  File "/usr/lib/python3/dist-packages/urllib3/util/ssl_.py", line 332, in ssl_wrap_socket
    return context.wrap_socket(sock, server_hostname=server_hostname)
  File "/usr/lib/python3.6/ssl.py", line 407, in wrap_socket
    _context=self, _session=session)
  File "/usr/lib/python3.6/ssl.py", line 817, in __init__
    self.do_handshake()
  File "/usr/lib/python3.6/ssl.py", line 1077, in do_handshake
    self._sslobj.do_handshake()
  File "/usr/lib/python3.6/ssl.py", line 689, in do_handshake
    self._sslobj.do_handshake()
urllib3.exceptions.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:852)
2020-11-10 15:56:25,29 -not-actor-/PID:1352 elasticsearch WARNING GET https://elastic.test.local:443/_cluster/health?wait_for_nodes=%3E%3D1 [status:N/A request:0.024s]
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 601, in urlopen
    chunked=chunked)
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 346, in _make_request
    self._validate_conn(conn)
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 852, in _validate_conn
    conn.connect()
  File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 340, in connect
    ssl_context=context)
  File "/usr/lib/python3/dist-packages/urllib3/util/ssl_.py", line 332, in ssl_wrap_socket
    return context.wrap_socket(sock, server_hostname=server_hostname)
  File "/usr/lib/python3.6/ssl.py", line 407, in wrap_socket
    _context=self, _session=session)
  File "/usr/lib/python3.6/ssl.py", line 817, in __init__
    self.do_handshake()
  File "/usr/lib/python3.6/ssl.py", line 1077, in do_handshake
    self._sslobj.do_handshake()
  File "/usr/lib/python3.6/ssl.py", line 689, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:852)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/elasticsearch/connection/http_urllib3.py", line 238, in perform_request
    method, url, body, retries=Retry(False), headers=request_headers, **kw
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 639, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/usr/lib/python3/dist-packages/urllib3/util/retry.py", line 343, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/usr/lib/python3/dist-packages/six.py", line 692, in reraise
    raise value.with_traceback(tb)
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 601, in urlopen
    chunked=chunked)
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 346, in _make_request
    self._validate_conn(conn)
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 852, in _validate_conn
    conn.connect()
  File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 340, in connect
    ssl_context=context)
  File "/usr/lib/python3/dist-packages/urllib3/util/ssl_.py", line 332, in ssl_wrap_socket
    return context.wrap_socket(sock, server_hostname=server_hostname)
  File "/usr/lib/python3.6/ssl.py", line 407, in wrap_socket
    _context=self, _session=session)
  File "/usr/lib/python3.6/ssl.py", line 817, in __init__
    self.do_handshake()
  File "/usr/lib/python3.6/ssl.py", line 1077, in do_handshake
    self._sslobj.do_handshake()
  File "/usr/lib/python3.6/ssl.py", line 689, in do_handshake
    self._sslobj.do_handshake()

2020-11-10 15:58:29,775 ActorAddr-(T|:49254)/PID:1352 esrally.actor INFO BenchmarkActor received unknown message [ActorExitRequest] (ignoring).
2020-11-10 15:58:32,775 -not-actor-/PID:1343 esrally.rally INFO Attempting to shutdown internal actor system.
2020-11-10 15:58:32,784 -not-actor-/PID:1351 root INFO ActorSystem Logging Shutdown
2020-11-10 15:58:32,805 -not-actor-/PID:1350 root INFO ---- Actor System shutdown
2020-11-10 15:58:32,806 -not-actor-/PID:1343 esrally.rally INFO Actor system is still running. Waiting...
2020-11-10 15:58:33,808 -not-actor-/PID:1343 esrally.rally INFO Shutdown completed.
2020-11-10 15:58:33,808 -not-actor-/PID:1343 esrally.rally ERROR Cannot run subcommand [race].
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/esrally/rally.py", line 714, in dispatch_sub_command
    race(cfg)
  File "/usr/local/lib/python3.6/dist-packages/esrally/rally.py", line 624, in race
    with_actor_system(racecontrol.run, cfg)
  File "/usr/local/lib/python3.6/dist-packages/esrally/rally.py", line 652, in with_actor_system
    runnable(cfg)
  File "/usr/local/lib/python3.6/dist-packages/esrally/racecontrol.py", line 365, in run
    raise e
  File "/usr/local/lib/python3.6/dist-packages/esrally/racecontrol.py", line 362, in run
    pipeline(cfg)
  File "/usr/local/lib/python3.6/dist-packages/esrally/racecontrol.py", line 79, in __call__
    self.target(cfg)
  File "/usr/local/lib/python3.6/dist-packages/esrally/racecontrol.py", line 298, in benchmark_only
    return race(cfg, external=True)
  File "/usr/local/lib/python3.6/dist-packages/esrally/racecontrol.py", line 250, in race
    raise exceptions.RallyError(result.message, result.cause)
esrally.exceptions.RallyError: ('Error in race control (ConnectionError([SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:852)) caused by: SSLError([SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:852)))', None)

I have removed some repetitive log lines, else it would be too big for the forum.

Thanks in advance,
Ricardo

Hi,

thanks for the logs.

I spent some time trying to reproduce this locally with Rally 2.0.2 and Rally 1.4.1. (on Python 3.8) but was not successful. If Rally is not able to connect at all, we get:

ConnectionRefusedError: [Errno 61] Connection refused

If certificate verification is turned on via verify_certs:true, we get a slightly different error message:

elasticsearch.exceptions.SSLError: ConnectionError([SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1108)) caused by: SSLError([SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1108))

I also started looking at the CPython implementation that does the SSL handshake and without having hard evidence (but also not being able to reproduce the behavior you get) I suspect that it might be related to the SSL library because we do disable certificate verification from the application side.

Can you please upgrade to the latest Rally version 2.0.2 which also requires Python 3.8 (see our installation instructions for details)?

Daniel

Hi Daniel,

I have updated to Rally 2.0.2 with Python 3.8. It works now :slight_smile:.

Thanks for your assistance.

Ricardo

Hi Ricardo,

glad it's resolved now.

Daniel

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.