ES Aggregations in Spark


#1

Hello everyone,

Based on the discussion about ES use cases, I was wondering whether there is any way Spark could benefit from ES aggregations by converting their results to a DataFrame. For example, something like:
val esConf = ...
val esQuery = """{"aggs" : {"my_agg" : {"terms" : {"field": "field_A"} } } }"""
val jsonResult = client.search(esQuery, esConf)
val transformer = ...
val df = jsonResult.toDF(transformer)
val result = df.filter(...).join(otherDf) ....

A) Is there any plan to support something similar on the ES roadmap?
B) If I understand correctly how spark-es/hadoop-es works, the library "detects" the DataFrame's schema from the scroll's JSON results. Can you point me to the classes responsible for this detection? I was wondering whether these components could be used to build the 'transformer' from my example.
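
To make the 'transformer' idea concrete, here is a minimal sketch of one possible approach. It assumes the raw JSON of a terms aggregation response is already available as a string, and it relies on Spark's own JSON schema inference rather than anything from ES-Hadoop. The object TermsAggToDataFrame, the helper termsAggToDF, and the sample response are hypothetical names and data made up for illustration; they are not part of any ES or Spark API.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, explode}

object TermsAggToDataFrame {

  // Hypothetical helper (not an ES-Hadoop API): flattens the buckets of a
  // single terms aggregation into one row per bucket.
  def termsAggToDF(spark: SparkSession, jsonResult: String, aggName: String): DataFrame = {
    import spark.implicits._
    // Let Spark infer the schema from the raw JSON response string.
    val raw = spark.read.json(Seq(jsonResult).toDS())
    raw
      // "aggregations" is the top-level key of the ES response body.
      .select(explode(col(s"aggregations.$aggName.buckets")).as("bucket"))
      .select(col("bucket.key").as("key"), col("bucket.doc_count").as("doc_count"))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("terms-agg-sketch")
      .getOrCreate()

    // Hand-written sample in the shape of a terms aggregation response.
    val sample =
      """{"aggregations":{"my_agg":{"buckets":[{"key":"a","doc_count":3},{"key":"b","doc_count":1}]}}}"""

    val df = termsAggToDF(spark, sample, "my_agg")
    df.show()
    spark.stop()
  }
}
```

The resulting DataFrame can then be filtered and joined like any other. Nested or multi-level aggregations would need a more elaborate flattening step than this sketch provides.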

Thanks in advance for your suggestions.
-Jan


(Costin Leau) #2

A) Aggregations are not currently supported by ES-Hadoop; it's the next major item on the roadmap.
B) All the Spark SQL classes reside under their dedicated package:

