How to define parent-child relationships and run has_child queries from Spark?


#1

Hello

I am a beginner with Elasticsearch-Spark.
My data is created in a Spark job and then stored in Elasticsearch.
I have two types, "branch" and "employee", which are in a parent-child relationship.
I cannot find a way to define this parent-child relationship when writing the data from Spark to Elasticsearch.

Specifically, I have the following mapping defined in Elasticsearch:
curl -XPUT 'localhost:9200/company?pretty=1' -d'
{
  "mappings": {
    "branch": {},
    "employee": {
      "_parent": {
        "type": "branch"
      }
    }
  }
}'

Then, I put several documents into type branch:
curl -XPOST 'localhost:9200/company/branch/_bulk?pretty' -d '
{ "index": { "_id": "london" }}
{ "name": "London Westminster", "city": "London", "country": "UK" }
{ "index": { "_id": "liverpool" }}
{ "name": "Liverpool Central", "city": "Liverpool", "country": "UK" }
{ "index": { "_id": "paris" }}
{ "name": "Champs élysées", "city": "Paris", "country": "France" }
'

Next, I insert a document of type employee into the company index, with "london" as its parent:
curl -XPUT 'localhost:9200/company/employee/1?parent=london' -d'
{
  "name": "Alice Smith",
  "dob": "1970-10-24",
  "hobby": "hiking"
}'

Then, inside spark-shell, I load the employee type as a DataFrame:
scala> val options = Map("pushdown" -> "true", "es.read.metadata" -> "true")
scala> val esDf = sqlContext.read.format("org.elasticsearch.spark.sql").options(options).load("company/employee")
scala> esDf.printSchema
root
 |-- dob: timestamp (nullable = true)
 |-- hobby: string (nullable = true)
 |-- name: string (nullable = true)
 |-- _metadata: map (nullable = true)
 |    |-- key: string
 |    |-- value: string (valueContainsNull = true)

The schema shows no parent information.
How can I specify "parent=london" from Spark when using functions such as saveToEs?
Also, can I run has_child and has_parent queries against Elasticsearch through Spark? If so, which functions or configuration settings should I use?
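
For reference, here is a minimal sketch of what the two operations might look like. It assumes the connector's `es.mapping.parent` and `es.query` settings apply here, and nothing in it has been run against this cluster. The `Employee` case class, its `branch` field, and the object name are hypothetical, introduced only for illustration. The code only builds the settings; the actual Spark calls are shown as comments.

```scala
// Sketch only: plain Scala that builds connector settings; it does not contact a cluster.
object ParentChildSketch {
  // Hypothetical row type. `branch` is an assumed extra field holding the parent id.
  case class Employee(name: String, dob: String, hobby: String, branch: String)

  val alice = Employee("Alice Smith", "1970-10-24", "hiking", "london")

  // Write side: take each document's parent id from its `branch` field.
  // For one fixed parent, the constant form "<london>" could be used instead.
  val writeCfg: Map[String, String] = Map("es.mapping.parent" -> "branch")

  // has_child selects parent documents, so it would target company/branch.
  val hasChildQuery: String =
    """{"query":{"has_child":{"type":"employee","query":{"match":{"hobby":"hiking"}}}}}"""

  // has_parent selects child documents, so it would target company/employee.
  val hasParentQuery: String =
    """{"query":{"has_parent":{"parent_type":"branch","query":{"match":{"country":"UK"}}}}}"""

  // Read side: same options as in the post, plus the query passed through es.query.
  val readCfg: Map[String, String] =
    Map("pushdown" -> "true", "es.read.metadata" -> "true", "es.query" -> hasChildQuery)

  // In spark-shell with the connector jar on the classpath (untested here):
  //   import org.elasticsearch.spark._
  //   sc.makeRDD(Seq(ParentChildSketch.alice))
  //     .saveToEs("company/employee", ParentChildSketch.writeCfg)
  //   sqlContext.read.format("org.elasticsearch.spark.sql")
  //     .options(ParentChildSketch.readCfg).load("company/branch")
}
```

With this approach the `branch` field would presumably be indexed as part of the employee document as well, unless it is excluded through a setting such as `es.mapping.exclude`.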

I use Elasticsearch 2.2.0, Spark 1.6.0 and elasticsearch-spark_2.10-2.2.0.jar

Many Thanks.

