Elasticsearch Hadoop - Issue reading GeoPoint as Dataset


Hi everyone,

I am using Elasticsearch Hadoop v. 6.2.3 with Spark 2.3.0 and am trying to read data from Elasticsearch as a Dataset. However, I am getting the following exception:

WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0,, executor 0): java.lang.IndexOutOfBoundsException: 1
at scala.collection.convert.Wrappers$JListWrapper.productElement(Wrappers.scala:85)
at scala.runtime.ScalaRunTime$$anon$1.next(ScalaRunTime.scala:177)
at org.apache.spark.sql.catalyst.CatalystTypeConverters$StructConverter.toCatalystImpl(CatalystTypeConverters.scala:251)

I am using Scala and have the following case class:
case class MyEntity (
  geometry: Geometry
)

Geometry looks like this:
case class Geometry (
  `type`: String, // `type` is a reserved word in Scala, so it needs backticks
  coordinates: GeoPoint
)

And GeoPoint looks like this:
case class GeoPoint (
  lat: Double,
  lon: Double
)

The schema of the data looks as follows:
[info] root
[info] |-- geometry: struct (nullable = true)
[info] | |-- coordinates: struct (nullable = true)
[info] | | |-- lat: double (nullable = true)
[info] | | |-- lon: double (nullable = true)
[info] | |-- type: string (nullable = true)

I am reading the data as a Dataset as follows:
val myEntities = sqlContext

Without the geometry, I can successfully read the data from Elasticsearch, so it is definitely related to that.

Do you have any idea what causes this issue and how to work around it?



By the way, it seems to happen within the select statement, not during the conversion to a Dataset, since I get the same error when I remove "as[MyEntity]" and just call "myEntities.show".


This might be related: https://github.com/elastic/elasticsearch-hadoop/issues/951

It is highly likely that this is related to #951.

Can you post on that issue with your reproduction above? The more test cases we have on file when I tackle this issue, the better the solution will be tested before it gets released.
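
For anyone else who lands here with the same trace, here is a rough sketch of why a list sitting where a struct is expected would produce exactly `IndexOutOfBoundsException: 1`. It is not ES-Hadoop or Spark code. The top frame of the trace is `JListWrapper.productElement`, and `JListWrapper` is a case class with a single field (the underlying Java list), so walking it as a product for a two-field struct fails on the second element. That the list is a geo point stored in array form is only an assumption on my part, and `ProductArityDemo`, `readFields`, `arrayFormPoint` and the coordinate values are hypothetical names and data made up for the illustration.

```scala
import scala.collection.JavaConverters._ // Scala 2.11/2.12 style, matching Spark 2.3
import scala.util.Try

object ProductArityDemo {
  case class GeoPoint(lat: Double, lon: Double)

  // Hypothetical stand-in for a struct converter: it takes one element per
  // schema field from the value's productIterator.
  def readFields(value: Product, fieldCount: Int): Try[Seq[Any]] = Try {
    val it = value.productIterator
    (0 until fieldCount).map(_ => it.next())
  }

  // Assumption: the geo point arrives in array form, as a Java list wrapped
  // by Scala. At runtime the wrapper is a JListWrapper, a case class whose
  // only field is the underlying list, so its product arity is 1.
  val arrayFormPoint: Product =
    java.util.Arrays.asList(13.4, 52.5).asScala.asInstanceOf[Product]

  def main(args: Array[String]): Unit = {
    println(arrayFormPoint.productArity)          // arity of the wrapper, not of the list
    println(readFields(GeoPoint(52.5, 13.4), 2))  // a real two-field product succeeds
    println(readFields(arrayFormPoint, 2))        // fails when the second field is requested
  }
}
```

If this reading is right, the schema and the case classes are fine, and the mismatch is between the struct schema and the shape of the value that was actually read for one of the two-field structs (`geometry` or `coordinates`).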


Thanks for your quick reply. I have commented on the GitHub issue.

