Calling esRDD in a transformation method causes Task not serializable


(Kyunam Kim) #1

Is the following not allowed because nested RDDs are prohibited?

val computedRDD = anRDD.map {
    ...
    val anotherRDD = sc.esRDD(...)
    ...
}

Thanks,
Q


(Costin Leau) #2

As I've mentioned in another thread, you are being quite cryptic in your questions so it's hard to understand what the issue is.

Task not serializable is a common issue in Spark, caused by passing around code that is not serializable. And yes, nesting an RDD of any kind qualifies as such. Take a look at the Spark docs: it's easy to get around by keeping the second data set outside the function you pass to the transformation and leveraging the RDD functions as much as possible.
Note that an RDD is a distributed collection in the end - it acts like one but it's actually a proxy to a distributed job.
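
To make that concrete, here is a minimal sketch of two common ways around a nested RDD. It is an illustration under assumptions, not the only approach: the object name, the method names, and the (String, Int) / (String, String) element types are made up for the example, and it assumes Spark 1.3 or later so that the pair-RDD functions are picked up without extra imports. In the situation from the question, the second RDD would be the one returned by sc.esRDD(...), created once on the driver instead of inside the map.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

// Hypothetical helper object; names and element types are for illustration only.
object NestedRddWorkaround {

  // Option 1: build both RDDs on the driver and let Spark combine them.
  // `right` could be the result of sc.esRDD(...), created outside any closure.
  def joinByKey(left: RDD[(String, Int)],
                right: RDD[(String, String)]): RDD[(String, (Int, String))] =
    left.join(right) // inner join: only keys present in both RDDs survive

  // Option 2: if the second data set is small, collect it to the driver and
  // broadcast a plain Map. The closure then captures only the broadcast
  // handle, which is serializable, and never the RDD or the SparkContext.
  def lookupWithBroadcast(sc: SparkContext,
                          left: RDD[(String, Int)],
                          small: RDD[(String, String)]): RDD[(String, Int, Option[String])] = {
    val table = sc.broadcast(small.collectAsMap())
    left.map { case (id, n) => (id, n, table.value.get(id)) }
  }
}
```

The join keeps everything distributed, so it is the safer default for large data. The broadcast variant avoids a shuffle but only makes sense when the second data set fits comfortably in memory on every executor.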


(Kyunam Kim) #3

That was a silly newbie question from me.
I now understand, from your response and further Google searches, that creating another RDD or calling actions inside a transformation is disallowed.

Thanks again,
-Q


(Costin Leau) #4

No worries. Everyone starts as a newbie.

