Multiple Spark jobs on same JVM


I'm trying to run my Spark jobs in parallel on a single JVM. Each job queries my ES database, but the query is set in the Spark driver's configuration, so I can only specify one query. Is it possible to update the configuration per job, or is there a trick around this?


You can specify configuration on a job-by-job basis in the ES-Hadoop Spark integration by using the saveToEs() methods that accept a Map as an argument. If a setting is not found in the passed-in Map, it falls back to the properties set on the SparkSession, and then to command-line and file-based property sources.
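As a sketch of what that looks like in practice (the index names, queries, and host below are placeholders, not from the thread) — reads accept a per-call query or Map in the same way that writes do:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.elasticsearch.spark._          // adds esRDD / saveToEs extension methods
import org.elasticsearch.spark.rdd.EsSpark

val sc = new SparkContext(new SparkConf().setAppName("multi-es-jobs"))

// Each job passes its own query instead of relying on a single
// driver-wide "es.query" setting:
val jobA = sc.esRDD("logs-a/doc", "?q=status:404")
val jobB = sc.esRDD("logs-b/doc", Map(
  "es.query" -> "?q=status:500",
  "es.nodes" -> "es-host-b:9200"   // per-job overrides work for any es.* setting
))

// Writes take a per-call Map the same way:
EsSpark.saveToEs(jobA, "errors/doc", Map("es.mapping.id" -> "id"))
```

Because the Map is resolved per call, the two jobs above can run concurrently on the same JVM (e.g. submitted from separate threads) without sharing a query.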
