Thanks for your reply, Costin. I am now trying Spark SQL and the pushdown is working great!
I've created two more queries:
Hello,
I've had some success working with the Spark SQL CLI to access our ES data.
Environment:
ES: 1.3.2
es-hadoop: elasticsearch-hadoop-2.2.0
Spark: spark-1.4.1-bin-hadoop2.6
The pushdown feature is working great!
However, I'm stuck on how to specify a date range in the WHERE clause so that it gets pushed down to ES.
I've tried:
SELECT count(member_id),member_id
FROM data
WHERE ( response_timestamp > CAST('2015-03-01' AS date) AND response_timestamp < CAST('2015-03-31' AS date) ) GRO…
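For reference, here is a minimal sketch of one way the date range could instead be supplied through the connector's query option when registering the table, so the filtering happens on the ES side; the index/type ('myindex/mytype') and node ('localhost') below are placeholders rather than our real values:

-- Sketch only: pre-filter on the ES side via the connector's query option
-- (option keys lose the "es." prefix when passed through Spark SQL OPTIONS).
CREATE TEMPORARY TABLE data
USING org.elasticsearch.spark.sql
OPTIONS (
  resource 'myindex/mytype',
  nodes    'localhost',
  query    '{ "query": { "range": { "response_timestamp": { "gte": "2015-03-01", "lt": "2015-03-31" } } } }'
);

-- The aggregation then runs over the pre-filtered documents:
SELECT count(member_id), member_id
FROM data
GROUP BY member_id;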
Hello,
I've had some success working with the Spark SQL CLI to access our ES data.
Environment:
ES: 1.3.2
es-hadoop: elasticsearch-hadoop-2.2.0
Spark: spark-1.4.1-bin-hadoop2.6
The pushdown feature is working great!
However, I noticed that for queries like:
SELECT count(member_id),member_id FROM data WHERE ( member_id > 2049510 AND member_id < 2049520 ) GROUP BY member_id ;
es-hadoop ends up reading all data from ES.
As a result, these queries take far too long. In my case it took 44 …
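In case it is relevant, here is a sketch of the temporary table definition with pushdown requested explicitly, which is what I would try next; the resource and node values are placeholders, not our real ones:

-- Sketch only: ask the connector to translate WHERE clauses into ES Query DSL.
CREATE TEMPORARY TABLE data
USING org.elasticsearch.spark.sql
OPTIONS (
  resource 'myindex/mytype',
  nodes    'localhost',
  pushdown 'true'
);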