The mapping is as follows:
"properties": {
"BH": {
"type": "string"
},
"GD": {
"type": "long"
},
"GJ": {
"type": "string"
},
"GPS": {
"type": "geo_point"
},
"HX": {
"type": "double"
},
"MC": {
"type": "string"
},
"MC2": {
"type": "string"
},
"MC3": {
"type": "string"
},
"MC4": {
"type": "string"
},
"SJ": {
"type": "date",
"format": "yyyy/MM/dd HH:mm:ss||yyyy/MM/dd||epoch_millis"
}
}
The test data:
{
  "GPS": {
    "lat": 21.22222,
    "lon": 22.111111
  }
}
Then I use es-hadoop to load the data into Spark and cache it. A minimal sketch of the read is below (connection settings and index/type name are placeholders, not my real ones):
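import org.apache.spark.sql.SparkSession

// Placeholder connection settings; the real cluster/index names differ.
val spark = SparkSession.builder()
  .appName("es-geo-point-cache")
  .config("es.nodes", "localhost")
  .config("es.port", "9200")
  .getOrCreate()

val df = spark.read
  .format("org.elasticsearch.spark.sql")
  .load("my_index/my_type")   // placeholder index/type

df.cache()
println(df.count())           // triggers the actual read from ES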
If the amount of data is below 10,000 documents, everything works well. When I cache more data, Spark throws the following error:
17/03/07 11:18:52 ERROR Executor: Exception in task 30.0 in stage 19.0 (TID 166)
java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.Double
at scala.runtime.BoxesRunTime.unboxToDouble(BoxesRunTime.java:114)
at org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow$class.getDouble(rows.scala:44)
at org.apache.spark.sql.catalyst.expressions.GenericInternalRow.getDouble(rows.scala:221)
at ...
Then I tried updating all GPS fields to the array format:
{
  "GPS": [
    21.22222,
    22.111111
  ]
}
With that change, everything succeeds, although Spark caches the geo_point field as an array type.
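For reference, a minimal sketch of how I check what Spark infers for GPS after re-indexing it as [lat, lon] arrays (the es.read.field.as.array.include hint is only a guess on my part and may not be needed):

// Same reader as above, after re-indexing GPS as arrays.
val dfArray = spark.read
  .format("org.elasticsearch.spark.sql")
  .option("es.read.field.as.array.include", "GPS")  // assumption: hint GPS as an array field
  .load("my_index/my_type")                          // placeholder index/type

dfArray.printSchema()   // GPS now shows up as an array column
dfArray.cache()
println(dfArray.count())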
So my question is: what is the correct way to cache geo_point fields in Spark and avoid this type error?
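One workaround I am considering, in case there is no direct fix, is to skip the GPS field entirely when reading (this relies on the es.read.field.exclude setting; I am not sure it is the right approach):

// Assumption: dropping the geo_point field from the read avoids the cast error,
// at the cost of losing the GPS data in Spark.
val dfNoGps = spark.read
  .format("org.elasticsearch.spark.sql")
  .option("es.read.field.exclude", "GPS")
  .load("my_index/my_type")   // placeholder index/type

dfNoGps.cache()
println(dfNoGps.count())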
language: Scala
ES version: 2.3.2
es-hadoop version: 5.1
Spark version: 2.0.2
Thanks for your help!