Elasticsearch insertion failing

While inserting data into Elasticsearch using Spark, I am getting the following exception.
I am completely blocked here. Can anybody help?

I am setting the following properties for ES:
"es.nodes"
"es.net.http.auth.user"
"es.net.http.auth.pass"
"es.write.operation" = "upsert"
"es.batch.write.retry.count" = "10"
"es.mapping.id"

Job aborted due to stage failure: Task 3469 in stage 9351.0 failed 4 times, most recent failure: Lost task 3469.3 in stage 9351.0 (TID 130002): org.elasticsearch.hadoop.rest.EsHadoopTransportException: java.net.BindException: Address already in use (Bind failed)
    at org.elasticsearch.hadoop.rest.NetworkClient.execute(NetworkClient.java:129)
    at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:461)
    at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:425)
    at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:429)
    at org.elasticsearch.hadoop.rest.RestClient.get(RestClient.java:155)
    at org.elasticsearch.hadoop.rest.RestClient.getHttpNodes(RestClient.java:112)
    at org.elasticsearch.hadoop.rest.InitializationUtils.discoverNodesIfNeeded(InitializationUtils.java:92)
    at org.elasticsearch.hadoop.rest.RestService.createWriter(RestService.java:574)
    at org.elasticsearch.spark.rdd.EsRDDWriter.write(EsRDDWriter.scala:58)
    at org.elasticsearch.spark.sql.EsSparkSQL$$anonfun$saveToEs$1.apply(EsSparkSQL.scala:80)
    at org.elasticsearch.spark.sql.EsSparkSQL$$anonfun$saveToEs$1.apply(EsSparkSQL.scala:80)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.BindException: Address already in use (Bind failed)
    at java.net.PlainSocketImpl.socketBind(Native Method)
    at java.net.AbstractPlainSocketImpl.bind(AbstractPlainSocketImpl.java:387)
    at java.net.Socket.bind(Socket.java:644)
    at java.net.Socket.<init>(Socket.java:433)
    at java.net.Socket.<init>(Socket.java:286)
    at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:80)
    at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:122)
    at org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
    at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
    at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
    at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
    at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
    at org.elasticsearch.hadoop.rest.commonshttp.CommonsHttpTransport.execute(CommonsHttpTransport.java:478)
    at org.elasticsearch.hadoop.rest.NetworkClient.execute(NetworkClient.java:112)
    ... 16 more

Driver stacktrace:

I'm assuming you are using Spark Streaming. If so, make sure that you are using the connector's native support for the DStream API, since the native support enables connection pooling of resources on the Spark side. The underlying issue is that your Spark executors are opening and closing HTTP connections at a rate higher than the underlying operating system can clean them up, which exhausts the local ephemeral ports and causes the bind failure. If you are not using Spark Streaming, look at reducing the number of concurrent tasks to get the connection count under control.
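
As a rough sketch of the streaming path (assuming ES-Hadoop 5.x or later on the classpath; the Event case class, index name, and id field are placeholders):

    import org.apache.spark.streaming.dstream.DStream
    import org.elasticsearch.spark.streaming._   // brings saveToEs onto DStream with pooled connections

    case class Event(docId: String, payload: String)   // placeholder record type

    // The native DStream support reuses the connector's transport across micro-batches
    // instead of opening fresh HTTP connections for every task.
    def writeStream(events: DStream[Event]): Unit =
      events.saveToEs("my-index/my-type", Map("es.mapping.id" -> "docId"))

For a plain batch job, something like df.coalesce(n) before the save is one way to bring the number of simultaneous writers down.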
