Is it possible to write to ES from a json file in HDFS where JSON file has inconsistent or different keys in different records

siva_mannem · April 1, 2014, 6:14pm

Hi,

My JSON file looks like this:
+++++++++++++++++++
{"k1":"v1" , "k2":"v2" , "k3":"v3" , "k4":"v4" , "k5":"v5"}

{"k12":"v11" , "k23":"v22" , "k34":"v33" , "k45":"v44" , "k56":"v55"}

{"k1":"v111" , "k2":"v222" , "k3":"v333" , "k4":"v444" , "k5":"v555"}

{"k123":"v12" , "k234":"v23" , "k345":"v34" , "k456":"v45" , "k567":"v56"}
+++++++++++++++++++++

My Pig script looks like this:
+++++++++++++++++++++++++++
REGISTER /usr/lib/gphd/pig/elasticsearch-hadoop-1.3.0.M2-yarn.jar;

DEFINE ESTOR org.elasticsearch.hadoop.pig.EsStorage('es.nodes=gateway1 ,
es.resource=ca/sf');

A = LOAD '/elastic_search/in_dir/' using
JsonLoader('k1:chararray,k2:chararray,k3:chararray,k4:chararray,k5:chararray');

B = FOREACH A GENERATE k1, k3, k5;
+++++++++++++++++++++++++++++

I am expecting output like this:
+++++++++++++++
(v1,v3,v5)
(v111,v333,v555)
++++++++++++++++++

But I am getting output like this:
++++++++++++
(v1,v3,v5)
(v11,v33,v55)
(v111,v333,v555)
++++++++++++++

Is there any way to ignore the second record, since it has no k1, k3,
or k5 keys?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/410f55ac-7e43-4789-83dc-eb4958fa2d55%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

siva_mannem · April 3, 2014, 4:45pm

Sorry, I am actually getting output like this:
++++++++++++
(v1,v3,v5)
(v11,v33,v55)
(v111,v333,v555)
(v12,v34,v56)
++++++++++++++
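
The output above suggests that values are being matched to the schema by position rather than by key name: v11, v33 and v55 are simply the first, third and fifth values of the second record. As an illustration of matching by key instead, here is a minimal sketch in Python that could run as a preprocessing step outside Pig. It assumes one JSON object per line, and `keep_matching_records` and `REQUIRED_KEYS` are hypothetical names used only for this example; they are not part of Pig or elasticsearch-hadoop.

```python
import json

# Assumption: these are the keys the Pig script projects (k1, k3, k5).
REQUIRED_KEYS = ("k1", "k3", "k5")


def keep_matching_records(lines, required=REQUIRED_KEYS):
    """Yield a tuple of the required values for each record that has every required key."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # the sample file has blank lines between records
        record = json.loads(line)
        # Match on key names, not on position within the record.
        if all(key in record for key in required):
            yield tuple(record[key] for key in required)


# The four sample records from the question, one JSON object per line.
sample = [
    '{"k1":"v1" , "k2":"v2" , "k3":"v3" , "k4":"v4" , "k5":"v5"}',
    '',
    '{"k12":"v11" , "k23":"v22" , "k34":"v33" , "k45":"v44" , "k56":"v55"}',
    '',
    '{"k1":"v111" , "k2":"v222" , "k3":"v333" , "k4":"v444" , "k5":"v555"}',
    '',
    '{"k123":"v12" , "k234":"v23" , "k345":"v34" , "k456":"v45" , "k567":"v56"}',
]

for row in keep_matching_records(sample):
    print(row)
```

With the four sample records this prints ('v1', 'v3', 'v5') and ('v111', 'v333', 'v555'), which corresponds to the expected output in the question.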

