Spark-SQL: Ensure SQL query gets translated to ES query

Hi,
I am getting started with the Spark Elasticsearch connector, and I notice that my SQL query never gets translated into an ES search query, even with pushdown enabled.

Could you let me know what is wrong in the following set of steps? I registered the DataFrame through Spark's DataSource API, but I still don't see the query being executed on Elasticsearch.

    SparkConf conf = new SparkConf().setAppName("Simple Application");

    Map<String,String> dataFrameOptions = new HashMap<String,String>();
    dataFrameOptions.put("es.resource", "myindex/account");
    dataFrameOptions.put("es.nodes", "192.168.224.94");
    dataFrameOptions.put("es.port", "9200");
    dataFrameOptions.put("es.index.auto.create", "no");
    dataFrameOptions.put("es.nodes.discovery", "false");
    dataFrameOptions.put("pushdown", "true");
    dataFrameOptions.put("double.filtering", "false");

    JavaSparkContext sc = new JavaSparkContext(conf);
    SQLContext sqlContext = new org.apache.spark.sql.SQLContext(sc);

    DataFrame myEsDump = sqlContext.read()
        .format("org.elasticsearch.spark.sql")
        .options(dataFrameOptions)
        .load("myindex/account");
    myEsDump.registerTempTable("allAccounts");

    DataFrame accounts = sqlContext.sql("SELECT name FROM allAccounts WHERE name = 'Name-888'");

Here are the versions that I am using:

    <dependency> <!-- Spark dependency -->
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.10</artifactId>
        <version>1.6.0</version>
    </dependency>
    <dependency>
        <groupId>org.elasticsearch</groupId>
        <artifactId>elasticsearch-spark_2.10</artifactId>
        <version>2.2.0-rc1</version>
    </dependency>

Also, in the logs I see the following; however, I don't see the query being run on Elasticsearch (I have the search/fetch slow logs enabled with a 0s threshold):

    16/02/01 11:49:23 DEBUG DataSource: Pushing down filters [EqualTo(name,Name-888)]
    16/02/01 11:49:23 TRACE DataSource: Transformed filters into DSL $filterString
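
For readers unfamiliar with pushdown: the DEBUG line shows Spark handing the data source an `EqualTo(name,Name-888)` filter, and the TRACE line is where the resulting query DSL should appear (here it prints the literal placeholder `$filterString` instead). The sketch below only illustrates the idea of turning such a filter into a query DSL string. It is not the connector's code: the class and method names are hypothetical, and the exact DSL the connector emits may differ by version and settings.

```java
// Illustrative sketch only. PushdownSketch, jsonString and equalToDsl are
// hypothetical names, not part of elasticsearch-spark.
class PushdownSketch {

    // Quote a value as a JSON string, escaping only quotes and backslashes
    // (enough for this sketch; control characters are not handled).
    static String jsonString(String value) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : value.toCharArray()) {
            if (c == '"' || c == '\\') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.append('"').toString();
    }

    // Render an equality filter on one field as a "match" clause.
    // Assumption: a match query is used; a real translation might pick another clause.
    static String equalToDsl(String field, String value) {
        return "{\"query\":{\"match\":{" + jsonString(field) + ":" + jsonString(value) + "}}}";
    }

    public static void main(String[] args) {
        // prints {"query":{"match":{"name":"Name-888"}}}
        System.out.println(equalToDsl("name", "Name-888"));
    }
}
```

A string of this shape would travel as the body of the search request, so the filtering happens on the Elasticsearch side rather than in Spark.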

It turns out that the connector is issuing scan/scroll search requests. However, my index_search and index_fetch slow logs did not show them, which led me to the wrong conclusion. Packet tracing confirmed that the queries are being sent to ES.

FWIW, 2.2 GA fixed the log messages, and at TRACE/DEBUG level one sees the actual query from both Spark and ES.
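
As a rough sketch of how that logging might be switched on with the log4j 1.x properties file that Spark 1.6 reads: the logger names below are assumptions based on the connector's package names (`org.elasticsearch.spark.sql` is the format string used in the question), so adjust them to whatever categories show up in your own logs.

```properties
# Assumed logger names; verify against your connector version.
# Filter pushdown and the translated query on the Spark SQL side
log4j.logger.org.elasticsearch.spark.sql=TRACE
# REST traffic between the connector and Elasticsearch
log4j.logger.org.elasticsearch.hadoop.rest=TRACE
```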