With Spark + ES, when I set:
I got the following exception:
Is it clear enough or do you need more details?
Adding the version of the connector you are working with, along with the versions of the integrating technologies (ES, Spark, Hadoop, etc.), helps a great deal in troubleshooting these problems.
It's also normally helpful to include a small number of test records/mappings so that we can reproduce this faster.
I am using:
- Spark 1.6.2
- Elasticsearch 2.4
- elasticsearch-hadoop 2.4
I will try to set up a small sample tomorrow.
Hello, it's quite easy to reproduce with a single document:
Setting `conf.set("es.read.field.as.array.include", "a.b");` causes the NullPointerException. (I don't see this error when I use a DataFrame, only an RDD.)
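For reference, a minimal sketch of the kind of setup that triggers this for me (the index/type name and ES endpoint are placeholders; this needs a live cluster, so it's a sketch rather than a self-contained test):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.elasticsearch.spark._

// Hypothetical minimal repro against a local Elasticsearch node.
val conf = new SparkConf()
  .setAppName("es-array-npe-repro")
  .set("es.nodes", "localhost:9200")            // assumed ES endpoint
  .set("es.read.field.as.array.include", "a.b") // the setting that triggers the NPE
val sc = new SparkContext(conf)

// Reading through the RDD API throws the NullPointerException;
// the same read through a DataFrame does not.
val rdd = sc.esRDD("my-index/my-type")          // placeholder index/type
rdd.collect().foreach(println)
```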
Actually, at https://github.com/elastic/elasticsearch-hadoop/blob/2.4/mr/src/main/java/org/elasticsearch/hadoop/serialization/builder/JdkValueReader.java#L121, `currentFieldName` is null.
@ebuildy were you able to resolve this? I am seeing the exact same error while trying to do an ETL from one index to another.
Nope, there are plenty of issues with arrays (the latest I found, a bug when reading a field that has no mapping (e.g. an empty array), drove me crazy ^^).
So I switched from arrays to a flat structure, and there's no problem that way (having fields like toto_1, toto_2, toto_3 instead of an array field toto).
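To illustrate the workaround with a made-up document: instead of an array field that needs `es.read.field.as.array.include`,

```json
{ "toto": ["x", "y", "z"] }
```

each element gets its own flat field, so no array handling is required on read:

```json
{ "toto_1": "x", "toto_2": "y", "toto_3": "z" }
```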