A typical record in my ES index looks like:

```json
"_source": {
  "app": "panoply",
  "response": {
    "category": "uncategorized",
    "subcategory": "uncategorized",
    "activity_common_name": "name123",
    "score": 0,
    "duration_secs": 2,
    "sub_activity": null,
    "activity": "name123"
  },
  "member_id": 2357919,
  "device_user_identity": 1688734,
  "activity_type": "type123",
  "response_timestamp": "2016-01-10T23:05:18.000Z"
}
```
I created a temporary table from the Spark shell as follows:
sql("""
CREATE TEMPORARY TABLE jan10
USING org.elasticsearch.spark.sql
OPTIONS (
resource 'cortez/data',
nodes 'localhost',
port '9201',
scroll_size '500',
query '?response_timestamp:[2016-01-01 TO 2016-01-10]',
read_field_include 'member_id,response.category,response.subcategory,response.activity,response.activity_common_name,response.duration_secs,response.sub_activity,response_timestamp'
) """)
**Note:** `response.score` is not included in `read_field_include`.
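For reference, the equivalent definition through the DataFrame reader API looks like this (a sketch, assuming Spark 1.x where `sqlContext` is predefined in the shell; the dotted `es.*` keys correspond to the underscore option names used in `OPTIONS` above):

```scala
// Sketch: the same source defined via the DataFrame reader API instead of SQL DDL.
// The dotted es.* keys are the counterparts of the underscore names in OPTIONS.
val jan10 = sqlContext.read
  .format("org.elasticsearch.spark.sql")
  .option("es.nodes", "localhost")
  .option("es.port", "9201")
  .option("es.scroll.size", "500")
  .option("es.query", "?response_timestamp:[2016-01-01 TO 2016-01-10]")
  .option("es.read.field.include", "member_id,response.category,response.subcategory,response.activity,response.activity_common_name,response.duration_secs,response.sub_activity,response_timestamp")
  .load("cortez/data")

jan10.registerTempTable("jan10")
```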
and then executed:

```scala
sql("""SELECT * FROM jan10""").show()
```
I observed that the values of all fields that appear after `response.score` in the document (namely `response.duration_secs`, `response.sub_activity`, and `response.activity`) come back as null.
If I add `response.score` to `read_field_include`, all field values are fetched correctly.
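In the meantime, a possible workaround (a sketch on my side, assuming it is acceptable to fetch the extra field over the wire) is to add `response.score` back to `read_field_include` and simply leave it out of the projection:

```scala
// Workaround sketch: with response.score added back to read_field_include,
// project only the fields of interest so score never appears in the result.
sql("""
  SELECT member_id,
         response.category,
         response.subcategory,
         response.activity,
         response.activity_common_name,
         response.duration_secs,
         response.sub_activity,
         response_timestamp
  FROM jan10
""").show()
```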
This looks like a bug in the handling of `read_field_include`.
Could you please take a look?
Thanks.