I've fixed this in master and pushed a snapshot with the fix [1]. Let me know how it works for you.
Costin,
the integer data type doesn't work at all. I've added logs to the BufferedRestClient addtoIndex method: the first log is the
writable object's toString, and the second log is the mapper's writeValueAsString output. The integer data type used to work
fine before this commit, though.
2013-04-30 09:57:11,466 INFO org.elasticsearch.hadoop.rest.BufferedRestClient: Writable{rid=[B@1a3650ed, rdata={[B@4e0a2a38=8, [B@7d59ea8e=9}, rdate=1234, mapids=[[B@63fb050c, [B@75088a1b, [B@3a32ea4]}
2013-04-30 09:57:11,536 INFO org.elasticsearch.hadoop.rest.BufferedRestClient: ES index query{"index":{}}
{"rid":"AAAAAQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=","rdata":{"[B@4e0a2a38":"8","[B@7d59ea8e":"9"},"rdate":"1234","mapids":["AAAAAgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=","AAAAAwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=","AAAABAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA="]}
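A quick way to see what those base64 strings are: decoding the rid value by hand suggests it is the Writable's raw byte buffer being serialized rather than the decoded number, with the row's int value as a big-endian prefix followed by padding. A minimal sketch (an illustration against the log value above, not code from the connector):

```java
import java.util.Base64;

public class DecodeRid {
    public static void main(String[] args) {
        // The "rid" value from the "ES index query" log line above
        byte[] raw = Base64.getDecoder()
                .decode("AAAAAQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=");
        // The first four bytes read as a big-endian int; the remaining
        // bytes are all zero, which looks like buffer padding.
        int rid = ((raw[0] & 0xff) << 24) | ((raw[1] & 0xff) << 16)
                | ((raw[2] & 0xff) << 8) | (raw[3] & 0xff);
        System.out.println(rid + " of " + raw.length + " bytes"); // 1 of 32 bytes
    }
}
```

This decodes to 1, which matches the row's rid value; the mapids entries decode to 2, 3, and 4 the same way.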
Please let me know if you need more information; I can add more logs.
Thanks,
Abhishek
On Monday, April 29, 2013 5:01:03 PM UTC-6, Abhishek Andhavarapu wrote:
Costin, thanks. It works great. The only problem I see is when the map key/value or array data type is int: I see random
values in ES. It works great with strings. I know I can force the mapping on the ES side to be int, but I'm just wondering
if it's a simple fix.
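For reference, forcing the types on the ES side would look something like the mapping below (a sketch assuming the thread's radio index and artists type, with field names taken from the logs; adjust to your actual schema):

```json
{
  "artists": {
    "properties": {
      "rid":    { "type": "integer" },
      "mapids": { "type": "integer" },
      "rdate":  { "type": "string" }
    }
  }
}
```

In ES there is no separate array type; mapids as "integer" covers a list of ints.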
On Monday, April 29, 2013 11:35:12 AM UTC-6, Costin Leau wrote:
The issue has been fixed in master.
Cheers!
On Thursday, April 25, 2013 7:22:57 PM UTC+3, Abhishek Andhavarapu wrote:
Thanks Costin.
On Thu, Apr 25, 2013 at 10:20 AM, Costin Leau <costi...@gmail.com> wrote:
Looks like an error in ESSerDe for which I've raised an issue:
https://github.com/elasticsearch/elasticsearch-hadoop/issues/39
On Wednesday, April 24, 2013 6:25:35 PM UTC+2, Abhishek Andhavarapu wrote:
Thanks Costin for the reply. Here is the error.
2013-04-24 10:15:50,990 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Adding alias maptest3 to work list for file hdfs://hadoop1.local:8020/user/hive/warehouse/maptest3
2013-04-24 10:15:50,996 INFO org.apache.hadoop.hive.ql.exec.MapOperator: dump TS struct<rid:int,mapids:array<int>,rdate:string,rdata:map<int,string>>
2013-04-24 10:15:50,997 INFO ExecMapper:
<MAP>Id =3
<Children>
<TS>Id =0
<Children>
<SEL>Id =1
<Children>
<FS>Id =2
<Parent>Id = 1 null<\Parent>
<\FS>
<\Children>
<Parent>Id = 0 null<\Parent>
<\SEL>
<\Children>
<Parent>Id = 3 null<\Parent>
<\TS>
<\Children>
<\MAP>
2013-04-24 10:15:50,997 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Initializing Self 3 MAP
2013-04-24 10:15:50,997 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: Initializing Self 0 TS
2013-04-24 10:15:50,997 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: Operator 0 TS initialized
2013-04-24 10:15:50,997 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: Initializing children of 0 TS
2013-04-24 10:15:50,997 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initializing child 1 SEL
2013-04-24 10:15:50,998 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initializing Self 1 SEL
2013-04-24 10:15:51,008 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: SELECT struct<rid:int,mapids:array<int>,rdate:string,rdata:map<int,string>>
2013-04-24 10:15:51,012 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Operator 1 SEL initialized
2013-04-24 10:15:51,012 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initializing children of 1 SEL
2013-04-24 10:15:51,012 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Initializing child 2 FS
2013-04-24 10:15:51,012 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Initializing Self 2 FS
2013-04-24 10:15:51,031 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Operator 2 FS initialized
2013-04-24 10:15:51,031 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Initialization Done 2 FS
2013-04-24 10:15:51,031 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initialization Done 1 SEL
2013-04-24 10:15:51,031 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: Initialization Done 0 TS
2013-04-24 10:15:51,031 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Initialization Done 3 MAP
2013-04-24 10:15:51,039 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Processing alias maptest3 for file hdfs://hadoop1.allegiance.local:8020/user/hive/warehouse/maptest3
2013-04-24 10:15:51,040 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 3 forwarding 1 rows
2013-04-24 10:15:51,040 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 forwarding 1 rows
2013-04-24 10:15:51,043 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 1 forwarding 1 rows
2013-04-24 10:15:51,043 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: New Final Path: FS /user/hive/warehouse/_tmp.maptest1/000000_3
2013-04-24 10:15:51,422 FATAL ExecMapper: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"rid":1,"mapids":[2,3,4],"rdate":"1234","rdata":{5:"8",6:"9"}}
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:565)
at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.lang.ArrayStoreException
at java.lang.System.arraycopy(Native Method)
at java.util.ArrayList.toArray(ArrayList.java:306)
at org.elasticsearch.hadoop.hive.ESSerDe.hiveToWritable(ESSerDe.java:136)
at org.elasticsearch.hadoop.hive.ESSerDe.hiveToWritable(ESSerDe.java:197)
at org.elasticsearch.hadoop.hive.ESSerDe.serialize(ESSerDe.java:109)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:586)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:83)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:546)
... 9 more
2013-04-24 10:15:51,422 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 3 finished. closing...
2013-04-24 10:15:51,422 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 3 forwarded 1 rows
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.MapOperator: DESERIALIZE_ERRORS:0
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 finished. closing...
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 forwarded 1 rows
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 1 finished. closing...
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 1 forwarded 1 rows
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: 2 finished. closing...
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: 2 forwarded 0 rows
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: TABLE_ID_1_ROWCOUNT:0
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 1 Close done
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 Close done
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 3 Close done
2013-04-24 10:15:51,423 INFO ExecMapper: ExecMapper: processed 0 rows: used memory = 23614288
2013-04-24 10:15:51,435 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2013-04-24 10:15:51,439 WARN org.apache.hadoop.mapred.Child: Error running child
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"rid":1,"mapids":[2,3,4],"rdate":"1234","rdata":{5:"8",6:"9"}}
at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"rid":1,"mapids":[2,3,4],"rdate":"1234","rdata":{5:"8",6:"9"}}
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:565)
at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
... 8 more
Caused by: java.lang.ArrayStoreException
at java.lang.System.arraycopy(Native Method)
at java.util.ArrayList.toArray(ArrayList.java:306)
at org.elasticsearch.hadoop.hive.ESSerDe.hiveToWritable(ESSerDe.java:136)
at org.elasticsearch.hadoop.hive.ESSerDe.hiveToWritable(ESSerDe.java:197)
at org.elasticsearch.hadoop.hive.ESSerDe.serialize(ESSerDe.java:109)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:586)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:83)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:546)
... 9 more
2013-04-24 10:15:51,446 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
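For what it's worth, an ArrayStoreException raised from System.arraycopy inside ArrayList.toArray is the pattern you get when toArray is handed a typed array whose component type doesn't match the list elements. A minimal sketch of that pattern (a hypothetical reconstruction, not the actual ESSerDe code):

```java
import java.util.ArrayList;
import java.util.List;

public class ToArrayMismatch {
    public static void main(String[] args) {
        List<Object> values = new ArrayList<>();
        values.add(Integer.valueOf(2)); // an int element, as in mapids
        try {
            // Copying Integer elements into a String[] blows up inside
            // System.arraycopy, the same frames as in the trace above.
            values.toArray(new String[values.size()]);
        } catch (ArrayStoreException e) {
            System.out.println("ArrayStoreException, as expected");
        }
    }
}
```

That would be consistent with arrays and maps of int failing while arrays of string work.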
Thanks,
On Wednesday, April 24, 2013 12:44:03 AM UTC-6, Costin Leau wrote:
Hi,
1) What's the problem? Is there any error message that you receive? Except for UNIONs, Arrays (or Lists)
as well as Maps should work.
2) The ES-Hadoop integration sits outside ES. It's just something added to the Hadoop environment to talk to
ES, and the reason for that is to take advantage of the map/reduce capabilities, which map
nicely on top of ES.
A river or a single-instance process would render the parallel capabilities of Hadoop void.
3) Hive doesn't support an UPDATE statement, just INSERT and INSERT OVERWRITE, which don't
really apply here since it's an external table. We might extend the INSERT OVERWRITE semantics, but
that is tricky since it requires the notion of an ID; typically INSERT OVERWRITE is the equivalent
of dropping a table and then adding data into it, which is clearly not an update.
You are better off handling the UPDATE directly in ES.
Note that in Hive (as with the rest of the map/reduce frameworks) data is not updated, but
rather copied and transformed.
Cheers,
On Tuesday, April 23, 2013 11:25:37 PM UTC+2, Abhishek Andhavarapu wrote:
Hi All,
I'm trying to push data from Hive to Elasticsearch using external tables
(https://github.com/elasticsearch/elasticsearch-hadoop).
My ES index mapping
{
"rid": 1,
"mapids" : [2,3,4], //Array
"data": [ //Nested objects
{
"mapid": "5",
"value": "g1"
},
{
"mapid": "6",
"value": "g2"
}
]
}
My Hive table structure
CREATE EXTERNAL TABLE maptest_ex(
rid INT,
mapids ARRAY<INT>,
rdata MAP<INT,STRING>)
STORED BY 'org.elasticsearch.hadoop.hive.ESStorageHandler'
TBLPROPERTIES(
'es.host' = 'elasticsearch1',
'es.resource' = 'radio/artists/')
and I'm trying to push data from the local Hive table to the external table:
insert into table maptest_ex
select rid,mapids,rdata from maptest3
1) The push works for simple data types like int and string, but not for arrays and maps. How do I
push that data from Hive to ES?
2) Is there a Hive river I could use?
3) How do I update a document in ES? (If a row already exists, can the ES storage handler
delete the existing ES document and insert the new/updated doc?)
Any help is appreciated,
Thanks
--
You received this message because you are subscribed to a topic in the Google Groups "elasticsearch" group.
To unsubscribe from this topic, visit
https://groups.google.com/d/topic/elasticsearch/BAaoqF6SkiY/unsubscribe?hl=en-US.
To unsubscribe from this group and all its topics, send an email to elasticsearc...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.