NullPointerException when setting the "es.read.field.as.array.include" option

ebuildy · October 17, 2016, 8:41pm

With Spark + ES, when I set:

conf.set("es.read.field.as.array.include", "api_search_ads_print_data.ads");

I got the following exception:

java.lang.NullPointerException
at java.lang.String.startsWith(String.java:1405)
at java.lang.String.startsWith(String.java:1434)
at org.elasticsearch.hadoop.serialization.field.FieldFilter.filter(FieldFilter.java:105)
at org.elasticsearch.hadoop.serialization.field.FieldFilter.filter(FieldFilter.java:132)
at org.elasticsearch.hadoop.serialization.builder.JdkValueReader.addToArray(JdkValueReader.java:121)

Is it clear enough or do you need more details?

james.baiera · October 17, 2016, 8:44pm

Adding the version of the connector you are working with and the versions of the integrating technologies (ES, Spark, Hadoop, etc...) helps a great deal in troubleshooting these problems.

james.baiera · October 17, 2016, 8:55pm

It's also normally helpful to include a small number of test records/mappings that can allow us to reproduce this faster.

ebuildy · October 17, 2016, 9:01pm

I am using:

Spark 1.6.2
Elasticsearch 2.4
elasticsearch-hadoop 2.4

I will try to set up a small sample tomorrow.

Thanks,

ebuildy · November 2, 2016, 12:36pm

Hello, this is quite easy to reproduce with a single document:

"a": {
  "b": [
    {
      "c": "hello"
    }
  ]
}

Setting conf.set("es.read.field.as.array.include", "a.b"); causes the NullPointerException. (I only see this error when I use an RDD, not when I use a DataFrame.)

Actually, at https://github.com/elastic/elasticsearch-hadoop/blob/2.4/mr/src/main/java/org/elasticsearch/hadoop/serialization/builder/JdkValueReader.java#L121, currentFieldName is null.
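To illustrate why a null field name would surface inside String.startsWith, here is a minimal sketch. It is not the es-hadoop FieldFilter source: the class and the methods naiveIncluded and guardedIncluded are hypothetical names, and the prefix check is only an assumption about how an include filter might compare names. The one fact it relies on is that String.startsWith throws a NullPointerException when its argument is null.

```java
import java.util.Arrays;
import java.util.List;

public class NullFieldNameSketch {

    // Hypothetical, simplified include filter (NOT the real FieldFilter code).
    // It checks whether any include pattern starts with the given field name.
    static boolean naiveIncluded(String fieldName, List<String> includes) {
        for (String include : includes) {
            // String.startsWith(null) throws NullPointerException from inside
            // java.lang.String, which matches the top frames of the trace.
            if (include.startsWith(fieldName)) {
                return true;
            }
        }
        return false;
    }

    // Same check, but a missing field name is treated as "not included".
    static boolean guardedIncluded(String fieldName, List<String> includes) {
        if (fieldName == null) {
            return false;
        }
        return naiveIncluded(fieldName, includes);
    }

    public static void main(String[] args) {
        List<String> includes = Arrays.asList("a.b");

        System.out.println(naiveIncluded("a", includes));    // "a.b" starts with "a"
        System.out.println(guardedIncluded(null, includes)); // guard avoids the NPE

        try {
            naiveIncluded(null, includes);
        } catch (NullPointerException e) {
            System.out.println("NullPointerException for a null field name");
        }
    }
}
```

If the real filter behaves like this, a null check before the filter call would avoid the exception, but I have not verified that against the connector code.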

dustinschultz · February 7, 2017, 10:42pm

@ebuildy were you able to resolve this? I am seeing the exact same error while trying to do an ETL from one index to another.

ebuildy · February 24, 2017, 12:06pm

Hello,

Nope, there are plenty of issues with arrays (the latest one I found: "Bug when reading if a field has no mapping (empty array, for example)", which drives me crazy ^^).

So I replaced the arrays with a flat structure, and there are no problems that way.

(That is, I now have fields like toto_1, toto_2, toto_3 instead of toto: [].)
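A minimal sketch of that flattening, done before indexing. The class and the flatten helper are hypothetical names for illustration, and the 1-based suffix simply follows the toto_1, toto_2, toto_3 naming above; adapt it to your own documents.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FlattenArraySketch {

    // Hypothetical helper: turns field = [v1, v2, ...] into
    // field_1 = v1, field_2 = v2, ... (1-based suffixes, insertion order kept).
    static Map<String, Object> flatten(String field, List<?> values) {
        Map<String, Object> flat = new LinkedHashMap<>();
        for (int i = 0; i < values.size(); i++) {
            flat.put(field + "_" + (i + 1), values.get(i));
        }
        return flat;
    }

    public static void main(String[] args) {
        // prints {toto_1=x, toto_2=y, toto_3=z}
        System.out.println(flatten("toto", Arrays.asList("x", "y", "z")));
    }
}
```

The obvious trade-off is that the number of fields now depends on the data, so queries have to know how many toto_N fields to look at.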
