Elasticsearch Schema incompatibility with Hive Parquet


(Jae) #1

Hello Elasticsearch-Hadoop users,

I have a somewhat complicated situation. I am trying to back up Elasticsearch data to Hive using the Parquet file format with a schema. Writing the Parquet files worked, but while registering the schema in the Hive metastore I got the following error:

Caused by: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: ParseException line 1:208 mismatched input '-' expecting > near 'timestamp' in struct type

The person in charge of our Hadoop/Hive cluster told me this is a known Parquet-Hive incompatibility and that I should have used a long or string type for date fields, but I cannot control that in the Elasticsearch-Hadoop connector.

Is there any hidden :slight_smile: feature in elasticsearch-hadoop connector to convert date format to long or string type? Otherwise, if I have to modify the source code and build my own library, could I get some advice on where to make the changes?

Thank you!


(James Baiera) #2

Is there any hidden :slight_smile: feature in elasticsearch-hadoop connector to convert date format to long or string type?

@jae Setting es.mapping.date.rich = false instructs the connector to return date fields as Strings or Longs instead of rich date objects. This (not-secret-at-all :slight_smile:) property is explained in more depth in the connector's configuration documentation, along with the rest of the available user properties.
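
As a minimal sketch of how that property might be passed alongside other connector settings, assuming a PySpark job is doing the export (the thread does not say which integration is in use): the helper name `es_read_options`, the node address, and the index name below are hypothetical, and the Spark calls are left as comments because they need a cluster and the elasticsearch-hadoop jar.

```python
def es_read_options(nodes, resource, rich_dates=False):
    """Build elasticsearch-hadoop settings as a plain dict of strings.

    Hypothetical helper for illustration; it is not part of the connector.
    """
    return {
        "es.nodes": nodes,          # Elasticsearch host(s) to connect to
        "es.resource": resource,    # index to read from
        # "false" asks the connector to return date fields as primitives
        # (String or long) rather than rich date objects
        "es.mapping.date.rich": "true" if rich_dates else "false",
    }


# Placeholder host and index name, assumed for the example
opts = es_read_options("localhost:9200", "my-index")

# Assumed usage with Spark SQL (needs the elasticsearch-hadoop jar on the classpath):
# df = spark.read.format("org.elasticsearch.spark.sql").options(**opts).load()
# df.write.parquet("/path/to/backup")
```

With rich dates disabled, the date columns should reach the Parquet writer as plain strings or longs, which is the shape the Hive side was asking for.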


(Jae) #3

Wow, thank you so much. I should've read the documentation more carefully. Thanks a ton again :slight_smile:

