Error while processing a nested document: org.elasticsearch.hadoop.mr.WritableArrayWritable cannot be cast to org.apache.hadoop.io.MapWritable

Hi Team,
I am creating an external table on a nested JSON document that looks like the one below.

{
  "rootcol1": "val1",
  "rootcol2": "vla2",
  "rootcol3": "val2",
  "rootcol4": "val3",
  "rootcol5": [
    {
      "childcol1": "vla5",
      "childcol2":[
        {"innercol1":"innervalue1"},
        {"innercol2":"innervalue2"}
      ]
    }
    ]
}

Below is the create table statement.

create external table elasticserach_pool_59(rootcol1 STRING,rootcol2 STRING,rootcol3 STRING,childcol1 STRING) 
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES(
   'es.nodes'='Endpoint Name',
   'es.port'='port',
   'es.resource'='indexname',
   'es.nodes.wan.only'='true',
   'es.nodes.discover'='true',
   'es.mapping.names'='rootcol1 :rootcol1 ,rootcol2 :rootcol2 ,rootcol3:rootcol3,childcol1:rootcol5.childcol1',
   'es.read.field.as.array.exclude'='rootcol5.*'
);

I get the following error when I execute a select statement.

:org.elasticsearch.hadoop.mr.WritableArrayWritable cannot be cast to org.apache.hadoop.io.MapWritable

I tried the exclude option as well, but I get the same exception. If I do not include the child columns, I can see the data without any issue.
Please assist me with this issue. Thanks in advance.

I am new to this as well, but I created an external table like this:
create external table vgsale_es(
rank int,
....
stored by "org.elasticsearch.hadoop.hive.EsStorageHandler"
tblproperties("es.resource"="vgsale_es/vgsale",
"es.index.auto.create"="true",
"es.nodes"="localhost");

But the file I uploaded is a CSV.

Thanks Varun.
But I am using ES-Spark to index the document, and I am creating the external table to retrieve the data. I am indexing a nested JSON document (please see the document in my question), not a CSV file.

The table that I created uses ES-Hive to create the index in Elasticsearch.

Hi @Srinivas2, can you share the result of the index mapping endpoint for the index on Elasticsearch you are trying to query? I'm thinking there might be a misalignment of your es.read.field.as.array.exclude setting compared to the index mapping you're trying to read.

@james.baiera Thanks James.
Below is the mapping (I have updated the document as well). I removed some of the fields from the document, but the original document has the same structure as the one shown here. I need to extract the parent fields along with the child columns, so I used dot notation in es.mapping.names.

{
  "test_nested" : {
    "mappings" : {
      "1" : {
        "properties" : {
          "rootcol1" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "rootcol2" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "rootcol3" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "rootcol4" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "rootcol5" : {
            "properties" : {
              "childcol1" : {
                "type" : "text",
                "fields" : {
                  "keyword" : {
                    "type" : "keyword",
                    "ignore_above" : 256
                  }
                }
              },
              "childcol2" : {
                "properties" : {
                  "innercol1" : {
                    "type" : "text",
                    "fields" : {
                      "keyword" : {
                        "type" : "keyword",
                        "ignore_above" : 256
                      }
                    }
                  },
                  "innercol2" : {
                    "type" : "text",
                    "fields" : {
                      "keyword" : {
                        "type" : "keyword",
                        "ignore_above" : 256
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
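
One thing that may be worth trying, given the sample document and the mapping above: rootcol5 and rootcol5.childcol2 hold arrays of objects, so the table could declare rootcol5 with its real shape and mark both fields as arrays, instead of mapping childcol1 through a dotted path and excluding rootcol5.* from array handling. This is only a sketch and has not been run against this index. The table name test_nested_hive is made up, the connection values are the placeholders from the original statement, and the resource test_nested/1 is read off the mapping output. That es.read.field.as.array.include resolves this particular cast error is an assumption.

```sql
-- Hypothetical table; rootcol5 is declared as an array of structs that mirrors the document.
CREATE EXTERNAL TABLE test_nested_hive (
  rootcol1 STRING,
  rootcol2 STRING,
  rootcol3 STRING,
  rootcol5 ARRAY<STRUCT<childcol1:STRING, childcol2:ARRAY<STRUCT<innercol1:STRING, innercol2:STRING>>>>
)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES (
  'es.nodes'='Endpoint Name',
  'es.port'='port',
  'es.resource'='test_nested/1',
  'es.nodes.wan.only'='true',
  'es.read.field.as.array.include'='rootcol5,rootcol5.childcol2'
);

-- Flatten to one row per rootcol5 element so childcol1 comes back as a plain column.
SELECT t.rootcol1, t.rootcol2, t.rootcol3, child.childcol1
FROM test_nested_hive t
LATERAL VIEW explode(t.rootcol5) c AS child;
```

With this layout the flattening happens in the Hive query (LATERAL VIEW explode) rather than in es.mapping.names, so the connector never has to walk through an array while resolving a dotted field name.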

@james.baiera: Hi James, did you get a chance to look into this issue? I created an index with rootcol5 mapped as a nested type, but I am getting the same error. I also used es.read.field.as.array.include, but no luck. I followed the official docs for these settings, since rootcol5 contains an array of objects.

Could you please assist me with this?

Thanks
Srinivas
