Hello,
Around a third of our files fail when our Spark application writes a DataFrame into Elasticsearch through the Elasticsearch-Spark connector. The write fails with the following exception:
org.elasticsearch.hadoop.rest.EsHadoopTransportException: javax.net.ssl.SSLHandshakeException: Remote host closed connection during handshake
The certificates seem to be correct; otherwise we would expect it to fail 100% of the time.
We use Elasticsearch 6.1.3 and the "Elasticsearch Spark (for Spark 2.0) » 6.1.3" connector.
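For context, here is a minimal sketch of how a write like this is typically configured. It assumes PySpark and is not our actual job code: the host name, index name, truststore path and the helper names `es_write_options` and `write_dataframe` are placeholders. The `es.*` keys are standard elasticsearch-hadoop settings. According to the connector documentation, `es.nodes.wan.only` disables node discovery; whether that avoids the failing node-lookup call in the trace below is untested.

```python
def es_write_options(nodes, port=9200, truststore=None, wan_only=False):
    """Build connector options for an SSL-enabled write (hypothetical helper)."""
    options = {
        "es.nodes": nodes,
        "es.port": str(port),
        "es.net.ssl": "true",
        # Assumption: pin the protocol to the version we are currently testing.
        "es.net.ssl.protocol": "TLSv1.2",
        # When true, the connector talks only to the declared es.nodes.
        "es.nodes.wan.only": "true" if wan_only else "false",
    }
    if truststore:
        options["es.net.ssl.truststore.location"] = truststore
    return options


def write_dataframe(df, resource, options):
    """Append a Spark DataFrame to the given index/type resource."""
    (df.write
       .format("org.elasticsearch.spark.sql")
       .options(**options)
       .mode("append")
       .save(resource))


if __name__ == "__main__":
    # Placeholder values only; printing the options needs no Spark session.
    print(es_write_options("es.example.internal", wan_only=True))
```

With a real DataFrame `df`, the call would be `write_dataframe(df, "myindex/mytype", es_write_options("es.example.internal"))`, where the index and type names are again placeholders.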
The full exception:
¡error while spark execution=Job aborted due to stage failure: Task 0 in stage 3.0 failed 4 times, most recent failure: Lost task 0.3 in stage 3.0 (TID 7, xxx.tb.de, executor 1): org.elasticsearch.hadoop.rest.EsHadoopTransportException: javax.net.ssl.SSLHandshakeException: Remote host closed connection during handshake
at org.elasticsearch.hadoop.rest.NetworkClient.execute(NetworkClient.java:124)
at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:466)
at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:430)
at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:434)
at org.elasticsearch.hadoop.rest.RestClient.get(RestClient.java:155)
at org.elasticsearch.hadoop.rest.RestClient.getHttpNodes(RestClient.java:112)
at org.elasticsearch.hadoop.rest.RestClient.getHttpDataNodes(RestClient.java:129)
at org.elasticsearch.hadoop.rest.InitializationUtils.filterNonDataNodesIfNeeded(InitializationUtils.java:157)
at org.elasticsearch.hadoop.rest.RestService.createWriter(RestService.java:581)
at org.elasticsearch.spark.rdd.EsRDDWriter.write(EsRDDWriter.scala:58)
at org.elasticsearch.spark.sql.EsSparkSQL$$anonfun$saveToEs$1.apply(EsSparkSQL.scala:101)
at org.elasticsearch.spark.sql.EsSparkSQL$$anonfun$saveToEs$1.apply(EsSparkSQL.scala:101)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: javax.net.ssl.SSLHandshakeException: Remote host closed connection during handshake
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1002)
at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1385)
at sun.security.ssl.SSLSocketImpl.writeRecord(SSLSocketImpl.java:757)
at sun.security.ssl.AppOutputStream.write(AppOutputStream.java:123)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
at org.apache.commons.httpclient.HttpConnection.flushRequestOutputStream(HttpConnection.java:828)
at org.apache.commons.httpclient.HttpMethodBase.writeRequest(HttpMethodBase.java:2116)
at org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:1096)
at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:398)
at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
at org.elasticsearch.hadoop.rest.commonshttp.CommonsHttpTransport.execute(CommonsHttpTransport.java:478)
at org.elasticsearch.hadoop.rest.NetworkClient.execute(NetworkClient.java:112)
... 17 more
Caused by: java.io.EOFException: SSL peer shut down incorrectly
at sun.security.ssl.InputRecord.read(InputRecord.java:505)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:983)
... 31 more
787a89a49cfc¡cause=org.elasticsearch.hadoop.rest.EsHadoopTransportException: javax.net.ssl.SSLHandshakeException: Remote host closed connection during handshake
Executor logs:
Executor task launch worker for task 3, READ: TLSv1.2 Handshake, length = 333
Executor task launch worker for task 3, READ: TLSv1.2 Handshake, length = 4
Executor task launch worker for task 3, WRITE: TLSv1.2 Handshake, length = 70
Executor task launch worker for task 3, WRITE: TLSv1.2 Change Cipher Spec, length = 1
Executor task launch worker for task 3, WRITE: TLSv1.2 Handshake, length = 64
Executor task launch worker for task 3, READ: TLSv1.2 Change Cipher Spec, length = 1
Executor task launch worker for task 3, READ: TLSv1.2 Handshake, length = 64
%% Cached client session: [Session-1, TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA]
Executor task launch worker for task 3, WRITE: TLSv1.2 Application Data, length = 256
Executor task launch worker for task 3, READ: TLSv1.2 Application Data, length = 2640
Executor task launch worker for task 3, called close()
Executor task launch worker for task 3, called closeInternal(true)
Executor task launch worker for task 3, SEND TLSv1.2 ALERT: warning, description = close_notify
Executor task launch worker for task 3, WRITE: TLSv1.2 Alert, length = 48
Executor task launch worker for task 3, called closeSocket(true)
Executor task launch worker for task 3, called close()
Executor task launch worker for task 3, called closeInternal(true)
Executor task launch worker for task 3, called close()
Executor task launch worker for task 3, called closeInternal(true)
Executor task launch worker for task 3, WRITE: TLSv1.2 Handshake, length = 209
Executor task launch worker for task 3, received EOFException: error
Executor task launch worker for task 3, handling exception: javax.net.ssl.SSLHandshakeException: Remote host closed connection during handshake
Executor task launch worker for task 3, SEND TLSv1.2 ALERT: fatal, description = handshake_failure
Executor task launch worker for task 3, WRITE: TLSv1.2 Alert, length = 2
Executor task launch worker for task 3, Exception sending alert: java.net.SocketException: Broken pipe (Write failed)
Executor task launch worker for task 3, called closeSocket()
Executor task launch worker for task 3, called close()
Executor task launch worker for task 3, called closeInternal(true)
Executor task launch worker for task 3, called close()
Executor task launch worker for task 3, called closeInternal(true)
Executor task launch worker for task 3, called close()
Executor task launch worker for task 3, called closeInternal(true)
The "worker for task 3" thread closes the connection and then tries to WRITE (start a new handshake?) again, which causes the exception. Sometimes it works, sometimes it doesn't.
We have already tried setting all environments to TLS 1.0 and TLS 1.1, and are currently using TLS 1.2.
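To check whether the handshake also fails intermittently outside of Spark, one option is to repeat the handshake from an executor host and count the outcomes. The following is only a sketch under assumptions: it uses Python's standard `ssl` module rather than the JVM stack the connector runs on, and the host, port and CA file are placeholders. The names `handshake_once` and `probe` are made up for this example.

```python
import socket
import ssl
from collections import Counter


def handshake_once(host, port, cafile=None, timeout=5.0):
    """Open a TCP connection, complete one TLS handshake, return the protocol."""
    # cafile=None uses the system trust store; pass a CA bundle for a private CA.
    context = ssl.create_default_context(cafile=cafile)
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version()


def probe(host, port, attempts=50, handshake=handshake_once):
    """Run repeated handshakes and count protocol versions and error types."""
    outcomes = Counter()
    for _ in range(attempts):
        try:
            outcomes[handshake(host, port)] += 1
        except (ssl.SSLError, OSError) as exc:
            # A peer closing mid-handshake shows up as an SSL or socket error.
            outcomes[type(exc).__name__] += 1
    return outcomes


if __name__ == "__main__":
    # Placeholder endpoint; replace with a real Elasticsearch node.
    print(probe("es.example.internal", 9200, attempts=10))
```

If a plain client like this also sees a share of failed handshakes, that would point at the network path or the server side rather than at the connector.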
Does anyone have any ideas? Thanks in advance!