Null data from Hive table with only an index (no type)


(Lee) #1

Hi All,

I have run into a strange problem, described below:

Steps 1:

  1. Create a Hive table with only an index (e.g. 'es.resource' = 'tomcat', without a _type)
  2. Query the table

Results 1:
I occasionally get null data, while at other times the response data is correct.

Steps 2:

  1. Create a Hive table with both an index and a type (e.g. 'es.resource' = 'tomcat/err')
  2. Query the table

Results 2:
I get correct data, with no null data any more.

Question:
Do both the index and the type have to be set in es.resource? Is there any way to specify multiple _type values for one index, such as "tomcat/err" and "tomcat/std"?

Versions:
Elasticsearch is 2.4.5
elasticsearch-hadoop jar is 2.4.5
Hadoop is 2.7.2
Hive is 2.0.1

Thanks,

Dante


(James Baiera) #2

This is a bug in how the connector parses mappings. When the connector starts up, it tries to discover the mappings for the supplied resources. It pulls all available mappings for every type the resource corresponds to, but only uses the first one. Any record containing fields that are not in that first mapping has those fields discarded, so most records come back with nulls. This is fixed in the connector as of 6.0.0.
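
To make the behaviour described above concrete, here is a minimal toy sketch in Python. It is not the connector's actual code: the type names `err` and `std` come from the question, but the field names, the sample record, and all function names are made up for illustration. It only shows why projecting every record onto the first discovered mapping can yield nulls, and how a union of the per-type mappings would avoid that.

```python
# Toy model only: NOT the elasticsearch-hadoop implementation.
# Hypothetical per-type mappings for an index queried without a type.
mappings = {
    "err": ["timestamp", "stacktrace"],
    "std": ["timestamp", "message"],
}

def project(record, fields):
    """Keep only the listed fields; a missing field becomes None (a Hive NULL)."""
    return {f: record.get(f) for f in fields}

def first_mapping_fields(mappings):
    # Behaviour as described in the reply: only the first mapping is used.
    return list(next(iter(mappings.values())))

def merged_mapping_fields(mappings):
    # One possible alternative: union of fields across all types, order preserved.
    merged = []
    for fields in mappings.values():
        for f in fields:
            if f not in merged:
                merged.append(f)
    return merged

# A hypothetical document of type "std".
std_record = {"timestamp": "2017-06-01T00:00:00Z", "message": "server started"}

buggy = project(std_record, first_mapping_fields(mappings))
fixed = project(std_record, merged_mapping_fields(mappings))

print(buggy)  # "message" is dropped, "stacktrace" is None
print(fixed)  # "message" survives
```

In this sketch, which records lose fields depends on which mapping happens to come first, which would be consistent with the nulls appearing only occasionally.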


(Lee) #3

Got it.

I will try to upgrade the ES version and the connector.

Thanks for your help.

