Remote access to Elasticsearch from Spark

Hi Everyone,

First, I set up the ES server on AWS with the following commands, and TCP
ports 9200 and 9300 are allowed in the security group:

wget https://download.elastic.co/elasticsearch/elasticsearch/elasticsearch-1.5.2.deb
sudo dpkg -i elasticsearch-1.5.2.deb

Then I used sudo netstat -atnp to make sure the above ports are listening:

tcp6       0      0 :::9200                 :::*                    LISTEN
tcp6       0      0 :::9300                 :::*                    LISTEN
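
(Listening locally is not quite the same thing as being reachable from outside
the security group; a quick way to verify that from the laptop is a plain TCP
connect to the public endpoint, for example the small sketch below, using the
same public IP and port as in the code further down.)

import java.net.{InetSocketAddress, Socket}

// Try to open a TCP connection to the public Elasticsearch HTTP port, 5 s timeout.
val socket = new Socket()
try {
  socket.connect(new InetSocketAddress("52.68.202.80", 9200), 5000)
  println("9200 is reachable from this machine")
} finally {
  socket.close()
}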

Then, my Scala code:
val sparkConf = new SparkConf().setAppName("Test")
  .setMaster("local[2]")
  .set("es.nodes", "52.68.202.80:9200")
  .set("es.nodes.discovery", "false")
val sc = new SparkContext(sparkConf)

// total 267 hits
val query =
  "{\"query\":{\"bool\":{\"must\":[{\"range\":{\"scan_time\":{\"from\":\"2014-09-01T00:00:00\",\"to\":\"2014-09-01T00:00:59\"}}}]}}}"
val data = sc.esRDD("wifi-collection/final_data", query)
data.collect().foreach(println)

The weird thing is that when I run the code on my laptop (localhost), I always
get the following messages:

15/05/14 00:23:36 INFO SparkContext: Running Spark version 1.3.1
15/05/14 00:23:37 WARN NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable
15/05/14 00:23:37 INFO SecurityManager: Changing view acls to: jeremy
15/05/14 00:23:37 INFO SecurityManager: Changing modify acls to: jeremy
15/05/14 00:23:37 INFO SecurityManager: SecurityManager: authentication
disabled; ui acls disabled; users with view permissions: Set(jeremy); users
with modify permissions: Set(jeremy)
15/05/14 00:23:37 INFO Slf4jLogger: Slf4jLogger started
15/05/14 00:23:37 INFO Remoting: Starting remoting
15/05/14 00:23:37 INFO Remoting: Remoting started; listening on addresses
:[akka.tcp://sparkDriver@orion:58665]
15/05/14 00:23:37 INFO Utils: Successfully started service 'sparkDriver' on
port 58665.
15/05/14 00:23:37 INFO SparkEnv: Registering MapOutputTracker
15/05/14 00:23:37 INFO SparkEnv: Registering BlockManagerMaster
15/05/14 00:23:37 INFO DiskBlockManager: Created local directory at
/var/folders/lz/bc5hqqsn1gvg2hl4b8svwd_w0000gn/T/spark-4d9ee290-78d6-4537-8975-33886ece0b86/blockmgr-469a0ac4-e62d-4e55-b848-c3ddc5bf121f
15/05/14 00:23:37 INFO MemoryStore: MemoryStore started with capacity
1966.1 MB
15/05/14 00:23:37 INFO HttpFileServer: HTTP File server directory is
/var/folders/lz/bc5hqqsn1gvg2hl4b8svwd_w0000gn/T/spark-81548d67-dcb2-4e79-b382-79a2a0a32a76/httpd-e77bb5ee-c571-4a0b-bdab-8e8037a3205e
15/05/14 00:23:37 INFO HttpServer: Starting HTTP Server
15/05/14 00:23:37 INFO Server: jetty-8.y.z-SNAPSHOT
15/05/14 00:23:37 INFO AbstractConnector: Started
SocketConnector@0.0.0.0:58666
15/05/14 00:23:37 INFO Utils: Successfully started service 'HTTP file
server' on port 58666.
15/05/14 00:23:37 INFO SparkEnv: Registering OutputCommitCoordinator
15/05/14 00:23:37 INFO Server: jetty-8.y.z-SNAPSHOT
15/05/14 00:23:37 INFO AbstractConnector: Started
SelectChannelConnector@0.0.0.0:4040
15/05/14 00:23:37 INFO Utils: Successfully started service 'SparkUI' on
port 4040.
15/05/14 00:23:37 INFO SparkUI: Started SparkUI at http://orion:4040
15/05/14 00:23:38 INFO Executor: Starting executor ID <driver> on host
localhost
15/05/14 00:23:38 INFO AkkaUtils: Connecting to HeartbeatReceiver:
akka.tcp://sparkDriver@orion:58665/user/HeartbeatReceiver
15/05/14 00:23:38 INFO NettyBlockTransferService: Server created on 58667
15/05/14 00:23:38 INFO BlockManagerMaster: Trying to register BlockManager
15/05/14 00:23:38 INFO BlockManagerMasterActor: Registering block manager
localhost:58667 with 1966.1 MB RAM, BlockManagerId(<driver>, localhost,
58667)
15/05/14 00:23:38 INFO BlockManagerMaster: Registered BlockManager
15/05/14 00:23:39 INFO Version: Elasticsearch Hadoop v2.1.0.Beta4
[2c62e273d2]
15/05/14 00:23:39 INFO ScalaEsRDD: Reading from [wifi-collection/final_data]
15/05/14 00:23:39 INFO ScalaEsRDD: Discovered mapping
{wifi-collection=[mappings=[final_data=[bssid=STRING, gps_lat=DOUBLE,
gps_lng=DOUBLE, imei=STRING, m_lat=DOUBLE, m_lng=DOUBLE, net_lat=DOUBLE,
net_lng=DOUBLE, no=LONG, rss=DOUBLE, s_no=LONG, scan_time=DATE,
source=STRING, ssid=STRING, trace=LONG]]]} for [wifi-collection/final_data]
15/05/14 00:23:39 INFO SparkContext: Starting job: collect at App.scala:19
15/05/14 00:23:39 INFO DAGScheduler: Got job 0 (collect at App.scala:19)
with 5 output partitions (allowLocal=false)
15/05/14 00:23:39 INFO DAGScheduler: Final stage: Stage 0(collect at
App.scala:19)
15/05/14 00:23:39 INFO DAGScheduler: Parents of final stage: List()
15/05/14 00:23:39 INFO DAGScheduler: Missing parents: List()
15/05/14 00:23:39 INFO DAGScheduler: Submitting Stage 0 (ScalaEsRDD[0] at
RDD at AbstractEsRDD.scala:17), which has no missing parents
15/05/14 00:23:39 INFO MemoryStore: ensureFreeSpace(1496) called with
curMem=0, maxMem=2061647216
15/05/14 00:23:39 INFO MemoryStore: Block broadcast_0 stored as values in
memory (estimated size 1496.0 B, free 1966.1 MB)
15/05/14 00:23:39 INFO MemoryStore: ensureFreeSpace(1148) called with
curMem=1496, maxMem=2061647216
15/05/14 00:23:39 INFO MemoryStore: Block broadcast_0_piece0 stored as
bytes in memory (estimated size 1148.0 B, free 1966.1 MB)
15/05/14 00:23:39 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory
on localhost:58667 (size: 1148.0 B, free: 1966.1 MB)
15/05/14 00:23:39 INFO BlockManagerMaster: Updated info of block
broadcast_0_piece0
15/05/14 00:23:39 INFO SparkContext: Created broadcast 0 from broadcast at
DAGScheduler.scala:839
15/05/14 00:23:39 INFO DAGScheduler: Submitting 5 missing tasks from Stage
0 (ScalaEsRDD[0] at RDD at AbstractEsRDD.scala:17)
15/05/14 00:23:39 INFO TaskSchedulerImpl: Adding task set 0.0 with 5 tasks
15/05/14 00:23:39 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID
0, localhost, ANY, 3352 bytes)
15/05/14 00:23:39 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID
1, localhost, ANY, 3352 bytes)
15/05/14 00:23:39 INFO Executor: Running task 1.0 in stage 0.0 (TID 1)
15/05/14 00:23:39 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
15/05/14 00:24:54 INFO HttpMethodDirector: I/O exception
(java.net.ConnectException) caught when processing request: Operation timed
out
15/05/14 00:24:54 INFO HttpMethodDirector: Retrying request
15/05/14 00:24:54 INFO HttpMethodDirector: I/O exception
(java.net.ConnectException) caught when processing request: Operation timed
out
15/05/14 00:24:54 INFO HttpMethodDirector: Retrying request
15/05/14 00:26:09 INFO HttpMethodDirector: I/O exception
(java.net.ConnectException) caught when processing request: Operation timed
out
15/05/14 00:26:09 INFO HttpMethodDirector: Retrying request
15/05/14 00:26:09 INFO HttpMethodDirector: I/O exception
(java.net.ConnectException) caught when processing request: Operation timed
out
15/05/14 00:26:09 INFO HttpMethodDirector: Retrying request
15/05/14 00:27:25 INFO HttpMethodDirector: I/O exception
(java.net.ConnectException) caught when processing request: Operation timed
out
15/05/14 00:27:25 INFO HttpMethodDirector: Retrying request
15/05/14 00:27:25 INFO HttpMethodDirector: I/O exception
(java.net.ConnectException) caught when processing request: Operation timed
out
15/05/14 00:27:25 INFO HttpMethodDirector: Retrying request
15/05/14 00:28:40 ERROR NetworkClient: Node [Operation timed out] failed
(172.31.14.100:9200); selected next node [52.68.202.80:9200]
15/05/14 00:28:40 ERROR NetworkClient: Node [Operation timed out] failed
(172.31.14.100:9200); selected next node [52.68.202.80:9200]
15/05/14 00:28:40 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0).
18184 bytes result sent to driver
15/05/14 00:28:40 INFO TaskSetManager: Starting task 2.0 in stage 0.0 (TID
2, localhost, ANY, 3352 bytes)
15/05/14 00:28:40 INFO Executor: Running task 2.0 in stage 0.0 (TID 2)
15/05/14 00:28:40 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID
0) in 301234 ms on localhost (1/5)
15/05/14 00:28:40 INFO Executor: Finished task 1.0 in stage 0.0 (TID 1).
19532 bytes result sent to driver
15/05/14 00:28:40 INFO TaskSetManager: Starting task 3.0 in stage 0.0 (TID
3, localhost, ANY, 3352 bytes)
15/05/14 00:28:40 INFO Executor: Running task 3.0 in stage 0.0 (TID 3)
15/05/14 00:28:40 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID
1) in 301315 ms on localhost (2/5)
15/05/14 00:29:55 INFO HttpMethodDirector: I/O exception
(java.net.ConnectException) caught when processing request: Operation timed
out
15/05/14 00:29:55 INFO HttpMethodDirector: Retrying request
15/05/14 00:29:56 INFO HttpMethodDirector: I/O exception
(java.net.ConnectException) caught when processing request: Operation timed
out
15/05/14 00:29:56 INFO HttpMethodDirector: Retrying request
15/05/14 00:31:11 INFO HttpMethodDirector: I/O exception
(java.net.ConnectException) caught when processing request: Operation timed
out
15/05/14 00:31:11 INFO HttpMethodDirector: Retrying request
15/05/14 00:31:11 INFO HttpMethodDirector: I/O exception
(java.net.ConnectException) caught when processing request: Operation timed
out
15/05/14 00:31:11 INFO HttpMethodDirector: Retrying request
15/05/14 00:32:26 INFO HttpMethodDirector: I/O exception
(java.net.ConnectException) caught when processing request: Operation timed
out
15/05/14 00:32:26 INFO HttpMethodDirector: Retrying request
15/05/14 00:32:26 INFO HttpMethodDirector: I/O exception
(java.net.ConnectException) caught when processing request: Operation timed
out
15/05/14 00:32:26 INFO HttpMethodDirector: Retrying request
15/05/14 00:33:42 ERROR NetworkClient: Node [Operation timed out] failed
(172.31.14.100:9200); selected next node [52.68.202.80:9200]
15/05/14 00:33:42 ERROR NetworkClient: Node [Operation timed out] failed
(172.31.14.100:9200); selected next node [52.68.202.80:9200]
15/05/14 00:33:42 INFO Executor: Finished task 2.0 in stage 0.0 (TID 2).
18177 bytes result sent to driver
15/05/14 00:33:42 INFO TaskSetManager: Starting task 4.0 in stage 0.0 (TID
4, localhost, ANY, 3352 bytes)
15/05/14 00:33:42 INFO Executor: Running task 4.0 in stage 0.0 (TID 4)
15/05/14 00:33:42 INFO TaskSetManager: Finished task 2.0 in stage 0.0 (TID
2) in 301864 ms on localhost (3/5)
15/05/14 00:33:42 INFO Executor: Finished task 3.0 in stage 0.0 (TID 3).
18176 bytes result sent to driver
15/05/14 00:33:42 INFO TaskSetManager: Finished task 3.0 in stage 0.0 (TID
3) in 301882 ms on localhost (4/5)
15/05/14 00:34:58 INFO HttpMethodDirector: I/O exception
(java.net.ConnectException) caught when processing request: Operation timed
out
15/05/14 00:34:58 INFO HttpMethodDirector: Retrying request
15/05/14 00:36:13 INFO HttpMethodDirector: I/O exception
(java.net.ConnectException) caught when processing request: Operation timed
out
15/05/14 00:36:13 INFO HttpMethodDirector: Retrying request
15/05/14 00:37:29 INFO HttpMethodDirector: I/O exception
(java.net.ConnectException) caught when processing request: Operation timed
out
15/05/14 00:37:29 INFO HttpMethodDirector: Retrying request
15/05/14 00:38:44 ERROR NetworkClient: Node [Operation timed out] failed
(172.31.14.100:9200); selected next node [52.68.202.80:9200]
15/05/14 00:38:45 INFO Executor: Finished task 4.0 in stage 0.0 (TID 4).
19190 bytes result sent to driver
15/05/14 00:38:45 INFO TaskSetManager: Finished task 4.0 in stage 0.0 (TID
4) in 302573 ms on localhost (5/5)
15/05/14 00:38:45 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks
have all completed, from pool
15/05/14 00:38:45 INFO DAGScheduler: Stage 0 (collect at App.scala:19)
finished in 905.680 s
15/05/14 00:38:45 INFO DAGScheduler: Job 0 finished: collect at
App.scala:19, took 905.884621 s
• Finally, after this long wait, it starts printing the results with foreach(println):

(AU1KNAQHKoN_e7xsz2J7,Map(no -> 38344049, s_no -> 2988722, source ->
wifiscan2, bssid -> d850e6d5a770, ssid -> jasonlan, rss -> -91.0, gps_lat
-> 24.99175413, gps_lng -> 121.28153416, net_lat -> -10000.0, net_lng ->
-10000.0, m_lat -> 24.9916650318, m_lng -> 121.281452128, imei ->
352842060663324, scan_time -> Mon Sep 01 08:00:06 CST 2014, trace -> 1))

• and then it keeps trying to connect:

15/05/14 00:38:45 INFO SparkContext: Starting job: count at App.scala:20
15/05/14 00:38:45 INFO DAGScheduler: Got job 1 (count at App.scala:20) with
5 output partitions (allowLocal=false)
15/05/14 00:38:45 INFO DAGScheduler: Final stage: Stage 1(count at
App.scala:20)
15/05/14 00:38:45 INFO DAGScheduler: Parents of final stage: List()
15/05/14 00:38:45 INFO DAGScheduler: Missing parents: List()
15/05/14 00:38:45 INFO DAGScheduler: Submitting Stage 1 (ScalaEsRDD[0] at
RDD at AbstractEsRDD.scala:17), which has no missing parents
15/05/14 00:38:45 INFO MemoryStore: ensureFreeSpace(1464) called with
curMem=2644, maxMem=2061647216
15/05/14 00:38:45 INFO MemoryStore: Block broadcast_1 stored as values in
memory (estimated size 1464.0 B, free 1966.1 MB)
15/05/14 00:38:45 INFO MemoryStore: ensureFreeSpace(1121) called with
curMem=4108, maxMem=2061647216
15/05/14 00:38:45 INFO MemoryStore: Block broadcast_1_piece0 stored as
bytes in memory (estimated size 1121.0 B, free 1966.1 MB)
15/05/14 00:38:45 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory
on localhost:58667 (size: 1121.0 B, free: 1966.1 MB)
15/05/14 00:38:45 INFO BlockManagerMaster: Updated info of block
broadcast_1_piece0
15/05/14 00:38:45 INFO SparkContext: Created broadcast 1 from broadcast at
DAGScheduler.scala:839
15/05/14 00:38:45 INFO DAGScheduler: Submitting 5 missing tasks from Stage
1 (ScalaEsRDD[0] at RDD at AbstractEsRDD.scala:17)
15/05/14 00:38:45 INFO TaskSchedulerImpl: Adding task set 1.0 with 5 tasks
15/05/14 00:38:45 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID
5, localhost, ANY, 3352 bytes)
15/05/14 00:38:45 INFO TaskSetManager: Starting task 1.0 in stage 1.0 (TID
6, localhost, ANY, 3352 bytes)
15/05/14 00:38:45 INFO Executor: Running task 0.0 in stage 1.0 (TID 5)
15/05/14 00:38:45 INFO Executor: Running task 1.0 in stage 1.0 (TID 6)
15/05/14 00:40:00 INFO HttpMethodDirector: I/O exception
(java.net.ConnectException) caught when processing request: Operation timed
out
15/05/14 00:40:00 INFO HttpMethodDirector: Retrying request

What's weird is that when I sbt package the job, deploy it to the ES server,
and run it there with spark-submit, everything works fine, with no waiting
and no error messages.

I am sure I am missing something but I can't figure out what. >_<

Thanks for your help!!


I'm having the same problem today as well, although I'm trying to access a
server running in an AWS VPC (everything runs correctly when it's all on my
laptop). In my case, it appears to be trying to connect to the private IP
address before failing over to the public one (which eventually works).

Curt


Hi,

As you pointed out, the issue is caused by the AWS private IP being used instead of the public one. The connector
queries the nodes directly and uses the IP they advertise - in AWS, this is typically the private IP. As such, after
the initial discovery (which is done through the public IP), the connector (running in Spark) tries to connect using
the private IP.

Note that this issue is not specific to Elasticsearch; it applies to any service in AWS where 'direct' connections are
made within the cluster.
You can address this through the Elasticsearch network settings, as described here [1].

[1] Networking | Elasticsearch Guide | Elastic
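
There is also a connector-side workaround worth mentioning (a sketch only, assuming a connector version that supports
the es.nodes.wan.only setting, which newer es-hadoop releases expose): it restricts the connector to the addresses
listed in es.nodes and keeps it from switching to the node-published (private) IPs. The IP below is the public address
from the original post.

import org.apache.spark.{SparkConf, SparkContext}
import org.elasticsearch.spark._

val sparkConf = new SparkConf().setAppName("Test")
  .setMaster("local[2]")
  // List only the publicly reachable address.
  .set("es.nodes", "52.68.202.80:9200")
  // Do not ask the cluster for its node list...
  .set("es.nodes.discovery", "false")
  // ...and never use the IPs the nodes advertise (the AWS-private ones).
  .set("es.nodes.wan.only", "true")

val sc = new SparkContext(sparkConf)
// Same index/type as in the original post; count() forces a read through the connector.
val data = sc.esRDD("wifi-collection/final_data")
println(data.count())

On the Elasticsearch side, the fix from [1] amounts to making the node publish an address the Spark driver can actually
reach (for example via network.publish_host in elasticsearch.yml), so that the advertised IP is no longer the private
one.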


--
Costin
