Hive JSON integration and Deprecated _ID capability

corgone · December 4, 2015, 7:26pm

How do you explicitly set the _ID field when indexing via Hive using the "es.input.json" option? In 1.7 we were able to use the PATH command to extract the ID from the corresponding JSON record. Now, however, I'm unsure how to use data from the JSON for the ID, or whether that is possible at all. I tried cheating and giving the mapping command a JSON path (a la the RESOURCE option for document type), but no joy.

Thanks,
Chris

costin · December 8, 2015, 2:26pm

See the mapping options, in particular es.mapping.id. Just add it to your table configuration.

corgone · December 8, 2015, 4:22pm

I tried es.mapping.id originally, but it expects a Hive field to exist (hence the map). I was trying to keep the JSON-only setup while pulling an ID from the JSON string. In reality, it doesn't matter much: having ES automatically create the _ID field is okay, I'll just need to go into the JSON for any queries on the real ID. Still, it would be nice to be able to push the ID up using the hadoop libraries.

costin · December 8, 2015, 4:47pm

The field extractor works on both JSON and non-JSON/Hive input. What was your setup and error?

corgone · December 15, 2015, 8:55pm

Looks like I made a mistake. I tried to set es.mapping.id the same way as es.resource, with {} around the field name. Changing it to just the field name (as it appears in the JSON) worked. I didn't find anything in the documentation that mentioned the format.

e.g. for a JSON string that looked like:
{"id":"xxxx", "item": {"key1": 1, "key2": 2}}
...
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.resource' = 'test/doc',
'es.nodes'='my.node',
'es.mapping.id' = 'id',
'es.input.json' ='yes');

works fine.
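
For anyone landing here later, a fuller sketch of the same setup might look like the following. The table names raw_docs and es_docs and the column name data are hypothetical, made up for illustration; the TBLPROPERTIES are the ones from the snippet above. It assumes raw_docs already holds one JSON document per row, and that the target table exposes a single STRING column, which is how JSON input is normally declared.

```sql
-- Hypothetical staging table: one raw JSON document per row.
CREATE TABLE raw_docs (data STRING);

-- Hypothetical Elasticsearch-backed table. With es.input.json enabled,
-- the single STRING column is passed through as the document body.
-- es.mapping.id is the bare JSON field name, with no {} around it.
CREATE EXTERNAL TABLE es_docs (data STRING)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.resource' = 'test/doc',
              'es.nodes' = 'my.node',
              'es.mapping.id' = 'id',
              'es.input.json' = 'yes');

-- Index every staged document; the _id of each one should come from
-- its own "id" field.
INSERT OVERWRITE TABLE es_docs
SELECT data FROM raw_docs;
```

With the sample record above, the document would be expected to be indexed with _id "xxxx" rather than an auto-generated ID.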

Thanks,
Chris

costin · December 18, 2015, 12:41pm

Well, what made you think you should be using {}? The connector can extract the field from both JSON and library types, which is why only the actual field name is relevant.
