Elasticsearch nested objects and parent-child relationships


Currently we are using Spark SQL with the Hive Thrift server and the ES-Hadoop connector. Versions: Spark 1.4.1, ES-Hadoop connector 2.2.0, Elasticsearch 1.4.4.

If the Elasticsearch data model expresses relationships through parent-child mappings or nested objects, does the ES-Hadoop connector push the appropriate queries down to Elasticsearch? That is, does it convert SQL join statements into nested, has_child, or has_parent queries?


No, for several reasons:

  1. Hive doesn't allow any pluggable pushdown operations. With Hive, ES-Hadoop only performs projections and is unaware of which queries are run at runtime. Spark, on the other hand, does provide such hooks, so the ES-Hadoop integration with Spark does support pushdown.
  2. Joins are fairly complicated operations. Elasticsearch does not support them natively anyway, so even if such an operation were pushed down, it would not be possible to (easily) execute it in Elasticsearch.
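
Since the connector will not derive these queries from a SQL join, one workaround is to write the nested / has_child / has_parent query yourself and hand it to ES-Hadoop through its es.query setting, which accepts a query DSL document. The sketch below only builds that JSON string; it does not talk to Spark or Elasticsearch. The helper names and the example field and type names (comments, comments.author, comment, blog) are hypothetical and not taken from the question, and the query shapes follow the Elasticsearch 1.x query DSL.

```python
import json


def nested_query(path, inner):
    """Wrap `inner` in a nested query on the nested object field `path`."""
    return {"nested": {"path": path, "query": inner}}


def has_child_query(child_type, inner):
    """Match parent documents that have a child of `child_type` matching `inner`."""
    return {"has_child": {"type": child_type, "query": inner}}


def has_parent_query(parent_type, inner):
    """Match child documents whose parent of `parent_type` matches `inner`."""
    return {"has_parent": {"parent_type": parent_type, "query": inner}}


def to_es_query_option(query):
    """Serialize a query into the JSON form expected by the es.query setting.

    The body is wrapped in a top-level "query" key, as in a search request.
    """
    return json.dumps({"query": query})


if __name__ == "__main__":
    # Hypothetical example: posts that have a comment written by "alice".
    q = nested_query("comments", {"term": {"comments.author": "alice"}})
    print(to_es_query_option(q))
```

The resulting string would be supplied as the value of es.query when the Elasticsearch-backed table or RDD is defined, so the relationship is resolved inside Elasticsearch and Spark only sees the documents that already match.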