Multiple Elastic Queries per spark job

mattmattmatt1 · May 17, 2017, 1:51pm

Hi all,

I've got a Spark job that starts with a single Elasticsearch query. I want to use its result to build a second query within the same Spark job, pulling down more data based on what the first one returns.

I've attempted to do this using the following code:

  val conf = new SparkConf()
  conf.set("es.query", query)

  val sc = new SparkContext(conf)

  // SPARK STUFF TO DETERMINE SECOND ELASTIC QUERY

  val newElasticQuery = ...
  val newConf = new SparkConf()
  newConf.set("es.query", newElasticQuery)

  val newSparkContext = SparkContext.getOrCreate(newConf)

The issue is that this uses the original conf rather than the new query, so the same data is pulled.

How would you go about doing this?

Cheers!

james.baiera · May 17, 2017, 7:28pm

You can only have one SparkContext active per JVM, so your last call to SparkContext.getOrCreate returns the previously created SparkContext and the new conf is not applied. You will need to specify these settings on an RDD-by-RDD basis. You should be able to pass those settings to the RDD create call via a Map.
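
The getOrCreate behaviour can be checked locally. The following is a minimal sketch, assuming Spark is on the classpath and a local master is acceptable; the object name, app name, and query strings are placeholders for illustration and are not taken from the original job.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object GetOrCreateDemo {
  def main(args: Array[String]): Unit = {
    // Placeholder query strings, for illustration only.
    val firstConf = new SparkConf()
      .setMaster("local[1]")
      .setAppName("get-or-create-demo")
      .set("es.query", "?q=first")
    val sc = new SparkContext(firstConf)

    val secondConf = new SparkConf()
      .setMaster("local[1]")
      .setAppName("get-or-create-demo")
      .set("es.query", "?q=second")

    // A context is already active in this JVM, so getOrCreate hands back
    // that same instance and secondConf is not applied to it.
    val same = SparkContext.getOrCreate(secondConf)

    assert(same eq sc)
    assert(same.getConf.get("es.query") == "?q=first")

    sc.stop()
  }
}
```

Both assertions hold because the second conf never reaches the running context, which is why the same data comes back.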

mattmattmatt1 · May 18, 2017, 9:46am

Thanks for your reply! That makes sense; I misunderstood what getOrCreate was actually doing.

I'm still having trouble. Can you give me an example of how you would apply this via a Map?

james.baiera · May 18, 2017, 1:43pm

The last example in the Scala portion of this section in the docs shows the syntax:

EsSpark.saveToEs(rdd, "index/type", Map("setting" -> "value"))
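
The line above is the write side. For the read side asked about in this thread, the same idea might look like the sketch below. It assumes the elasticsearch-spark connector is on the classpath, that the job is launched through spark-submit (which supplies the master), and that an Elasticsearch cluster is reachable. The index names, field names, query strings, and the readSettings helper are hypothetical and only there to illustrate the shape.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.elasticsearch.spark.rdd.EsSpark

object TwoQueriesOneContext {
  // Hypothetical helper: builds the per-RDD connector settings.
  // Settings passed this way apply to one RDD only, not to the whole context.
  def readSettings(resource: String, query: String): Map[String, String] =
    Map("es.resource" -> resource, "es.query" -> query)

  def main(args: Array[String]): Unit = {
    // One SparkContext for the whole job; no es.query is set on the conf.
    val sc = new SparkContext(new SparkConf().setAppName("two-es-queries"))

    // First read: placeholder index and query.
    val first = EsSpark.esRDD(sc, readSettings("index/type", "?q=status:active"))

    // Placeholder for the Spark logic that derives the second query;
    // here it just collects a few document ids from the first result.
    val ids = first.map(_._1).take(10)
    val secondQuery = "?q=parent_id:(" + ids.mkString(" OR ") + ")"

    // Second read: same context, different per-RDD settings.
    val second = EsSpark.esRDD(sc, readSettings("other-index/type", secondQuery))
    println(second.count())

    sc.stop()
  }
}
```

Because each esRDD call carries its own Map, the two reads use different queries while sharing the single SparkContext.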

mattmattmatt1 · May 19, 2017, 9:46am

Got it working, thank you very much!

system · June 16, 2017, 9:46am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
