Hello, I am trying to use the setting es.read.field.as.array.exclude.
From its name, I understand that it should produce the opposite result of es.read.field.as.array.include: since es.read.field.as.array.include transforms the type of a field into an array, it seems reasonable that the exclude option would transform the type of a field from an array back to its element type.
However, the documentation states:
es.read.field.as.array.exclude
Fields/properties that should be considered as arrays/lists
which is the same description given for the include option, so I suspect this is just a copy-paste error.
As an experiment, I created an index with the following mapping:
{
  "test_index": {
    "mappings": {
      "test_type": {
        "dynamic": "strict",
        "_all": {
          "enabled": false
        },
        "properties": {
          "outer": {
            "type": "nested",
            "properties": {
              "inner1": {
                "type": "boolean"
              },
              "inner2": {
                "type": "keyword"
              }
            }
          }
        }
      }
    }
  }
}
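To make the setup concrete, a document matching this mapping, where outer holds a single object rather than an array (the field values here are made up for illustration), would look like:

```json
{
  "outer": {
    "inner1": true,
    "inner2": "some-value"
  }
}
```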
Nested fields are read as arrays by default, but in my case the field holds just a single struct.
Loading the data with the option
set(ConfigurationOptions.ES_READ_FIELD_AS_ARRAY_EXCLUDE, "outer")
gives me the following schema:
root
|-- outer: array (nullable = true)
| |-- element: struct (containsNull = true)
| | |-- inner1: boolean (nullable = true)
| | |-- inner2: string (nullable = true)
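For reference, a minimal sketch of a read along these lines, assuming the standard elasticsearch-spark-20 connector is on the classpath; the SparkSession configuration, host, and index path are placeholders for my actual setup (this needs a running Elasticsearch node, so it is not runnable standalone):

```scala
import org.apache.spark.sql.SparkSession
import org.elasticsearch.hadoop.cfg.ConfigurationOptions

// Placeholder local setup; in my case the cluster details differ.
val spark = SparkSession.builder()
  .appName("es-array-exclude-test")
  .master("local[*]")
  .config("es.nodes", "localhost:9200")
  .getOrCreate()

// Read test_index/test_type with outer listed in the array exclusions.
val df = spark.read
  .format("org.elasticsearch.spark.sql")
  .option(ConfigurationOptions.ES_READ_FIELD_AS_ARRAY_EXCLUDE, "outer")
  .load("test_index/test_type")

// Prints the schema quoted above: outer still comes back as an array.
df.printSchema()
```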
As you can see, the field outer still has type array despite being part of the exclusions.
Am I missing something? How does ConfigurationOptions.ES_READ_FIELD_AS_ARRAY_EXCLUDE actually work?
I'm using Spark 2.3.2, Elasticsearch 6.5.0, and elasticsearch-hadoop 6.5.0.