Pushing data from Hive to Elasticsearch

Hi All,

I'm trying to push data from Hive to Elasticsearch using external tables
(https://github.com/elasticsearch/elasticsearch-hadoop).

My ES index mapping

{
  "rid": 1,
  "mapids": [2, 3, 4],   // array
  "data": [              // nested objects
    {
      "mapid": "5",
      "value": "g1"
    },
    {
      "mapid": "6",
      "value": "g2"
    }
  ]
}

My Hive table structure

CREATE EXTERNAL TABLE maptest_ex(
  rid INT,
  mapids ARRAY<INT>,
  rdata MAP<INT, STRING>)
STORED BY 'org.elasticsearch.hadoop.hive.ESStorageHandler'
TBLPROPERTIES(
  'es.host' = 'elasticsearch1',
  'es.resource' = 'radio/artists/');

and I'm trying to push data from a local Hive table to the external table:

INSERT INTO TABLE maptest_ex
SELECT rid, mapids, rdata FROM maptest3;

  1. The push works for simple data types like INT and STRING, but not for
    arrays and maps. How do I push this data from Hive to ES?
  2. Is there a Hive river I could use?
  3. How do I update a document in ES? (If a row already exists, can the ES
    storage handler delete the existing ES document and insert the new/updated
    doc?)
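In other words, each row should translate into one document of the shape above, roughly like this (the `row_to_doc` helper is only an illustration, not part of ES-Hadoop):

```python
import json

def row_to_doc(rid, mapids, rdata):
    # Sketch of the intended translation: Hive INT -> number,
    # ARRAY<INT> -> JSON array, MAP<INT,STRING> -> list of nested objects.
    return {
        "rid": rid,
        "mapids": mapids,
        "data": [{"mapid": str(k), "value": v}
                 for k, v in sorted(rdata.items())],
    }

doc = row_to_doc(1, [2, 3, 4], {5: "g1", 6: "g2"})
print(json.dumps(doc))
```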

Any help is appreciated,

Thanks

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Hi,

  1. What's the problem? Do you receive any error message? Except for
    UNIONs, arrays (or lists) as well as maps should work.
  2. The ES-Hadoop integration sits outside ES. It's just something added to
    the Hadoop environment to talk to ES, and the reason for that is to take
    advantage of the map/reduce capabilities, which map nicely on top of ES.
    A river or a single-instance process would render the parallel capabilities
    of Hadoop void.
  3. Hive doesn't support any UPDATE statement - just INSERT and INSERT
    OVERWRITE, which doesn't really apply here since it's an external table. We
    might extend the INSERT OVERWRITE semantics, but that is tricky since it
    requires the notion of an ID - typically INSERT OVERWRITE is the equivalent
    of dropping a table and then adding data into it, which is clearly not an
    update.
    You are better off handling the UPDATE directly in ES.

Note that in Hive (as with the rest of the map/reduce frameworks) data is
not updated, but rather copied and transformed.
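A minimal sketch of what handling the update directly in ES could look like, using the partial-update endpoint (`/{index}/{type}/{id}/_update`); the host, index/type, and document id below are placeholders taken from the example table:

```python
import json
from urllib import request  # stdlib only; no ES client library assumed

def build_update(host, index, doc_type, doc_id, partial):
    # Assemble a partial-update request: a {"doc": ...} body merges the
    # given fields into the existing document instead of replacing it.
    url = "http://%s:9200/%s/%s/%s/_update" % (host, index, doc_type, doc_id)
    body = json.dumps({"doc": partial}).encode("utf-8")
    return request.Request(url, data=body,
                           headers={"Content-Type": "application/json"})

req = build_update("elasticsearch1", "radio", "artists", "1",
                   {"mapids": [2, 3, 4, 7]})
# request.urlopen(req) would send it; omitted since no live cluster is assumed.
```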

Cheers,

On Tuesday, April 23, 2013 11:25:37 PM UTC+2, Abhishek Andhavarapu wrote:


Thanks, Costin, for the reply. Here is the error:

2013-04-24 10:15:50,990 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Adding alias maptest3 to work list for file hdfs://hadoop1.local:8020/user/hive/warehouse/maptest3
2013-04-24 10:15:50,996 INFO org.apache.hadoop.hive.ql.exec.MapOperator: dump TS struct<rid:int,mapids:array<int>,rdate:string,rdata:map<int,string>>
2013-04-24 10:15:50,997 INFO ExecMapper:
Id =3

Id =0

Id =1

Id =2
Id = 1 null<\Parent>
<\FS>
<\Children>
Id = 0 null<\Parent>
<\SEL>
<\Children>
Id = 3 null<\Parent>
<\TS>
<\Children>
<\MAP>
2013-04-24 10:15:50,997 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Initializing Self 3 MAP
2013-04-24 10:15:50,997 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: Initializing Self 0 TS
2013-04-24 10:15:50,997 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: Operator 0 TS initialized
2013-04-24 10:15:50,997 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: Initializing children of 0 TS
2013-04-24 10:15:50,997 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initializing child 1 SEL
2013-04-24 10:15:50,998 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initializing Self 1 SEL
2013-04-24 10:15:51,008 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: SELECT struct<rid:int,mapids:array<int>,rdate:string,rdata:map<int,string>>
2013-04-24 10:15:51,012 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Operator 1 SEL initialized
2013-04-24 10:15:51,012 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initializing children of 1 SEL
2013-04-24 10:15:51,012 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Initializing child 2 FS
2013-04-24 10:15:51,012 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Initializing Self 2 FS
2013-04-24 10:15:51,031 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Operator 2 FS initialized
2013-04-24 10:15:51,031 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Initialization Done 2 FS
2013-04-24 10:15:51,031 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initialization Done 1 SEL
2013-04-24 10:15:51,031 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: Initialization Done 0 TS
2013-04-24 10:15:51,031 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Initialization Done 3 MAP
2013-04-24 10:15:51,039 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Processing alias maptest3 for file hdfs://hadoop1.allegiance.local:8020/user/hive/warehouse/maptest3
2013-04-24 10:15:51,040 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 3 forwarding 1 rows
2013-04-24 10:15:51,040 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 forwarding 1 rows
2013-04-24 10:15:51,043 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 1 forwarding 1 rows
2013-04-24 10:15:51,043 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: New Final Path: FS /user/hive/warehouse/_tmp.maptest1/000000_3
2013-04-24 10:15:51,422 FATAL ExecMapper: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"rid":1,"mapids":[2,3,4],"rdate":"1234","rdata":{5:"8",6:"9"}}
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:565)
at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.lang.ArrayStoreException
at java.lang.System.arraycopy(Native Method)
at java.util.ArrayList.toArray(ArrayList.java:306)
at org.elasticsearch.hadoop.hive.ESSerDe.hiveToWritable(ESSerDe.java:136)
at org.elasticsearch.hadoop.hive.ESSerDe.hiveToWritable(ESSerDe.java:197)
at org.elasticsearch.hadoop.hive.ESSerDe.serialize(ESSerDe.java:109)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:586)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:83)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:546)
... 9 more

2013-04-24 10:15:51,422 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 3 finished. closing...
2013-04-24 10:15:51,422 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 3 forwarded 1 rows
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.MapOperator: DESERIALIZE_ERRORS:0
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 finished. closing...
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 forwarded 1 rows
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 1 finished. closing...
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 1 forwarded 1 rows
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: 2 finished. closing...
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: 2 forwarded 0 rows
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: TABLE_ID_1_ROWCOUNT:0
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 1 Close done
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 Close done
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 3 Close done
2013-04-24 10:15:51,423 INFO ExecMapper: ExecMapper: processed 0 rows: used memory = 23614288
2013-04-24 10:15:51,435 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2013-04-24 10:15:51,439 WARN org.apache.hadoop.mapred.Child: Error running child
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"rid":1,"mapids":[2,3,4],"rdate":"1234","rdata":{5:"8",6:"9"}}
at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"rid":1,"mapids":[2,3,4],"rdate":"1234","rdata":{5:"8",6:"9"}}
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:565)
at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
... 8 more
Caused by: java.lang.ArrayStoreException
at java.lang.System.arraycopy(Native Method)
at java.util.ArrayList.toArray(ArrayList.java:306)
at org.elasticsearch.hadoop.hive.ESSerDe.hiveToWritable(ESSerDe.java:136)
at org.elasticsearch.hadoop.hive.ESSerDe.hiveToWritable(ESSerDe.java:197)
at org.elasticsearch.hadoop.hive.ESSerDe.serialize(ESSerDe.java:109)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:586)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:83)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:546)
... 9 more
2013-04-24 10:15:51,446 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task

Thanks,

On Wednesday, April 24, 2013 12:44:03 AM UTC-6, Costin Leau wrote:


Looks like an error in ESSerDe, for which I've raised an issue:
Serialization bug in ESSerDe.hiveToWritable · Issue #39 · elastic/elasticsearch-hadoop · GitHub

On Wednesday, April 24, 2013 6:25:35 PM UTC+2, Abhishek Andhavarapu wrote:


Thanks Costin.

On Thu, Apr 25, 2013 at 10:20 AM, Costin Leau costin.leau@gmail.com wrote:

Looks like an error in ESSerDe for which I've raised an issue:
Serialization bug in ESSerDe.hiveToWritable · Issue #39 · elastic/elasticsearch-hadoop · GitHub

at org.apache.hadoop.hive.ql.**exec.Operator.process(**Operator.java:474)
at org.apache.hadoop.hive.ql.**exec.Operator.forward(**Operator.java:800)
at org.apache.hadoop.hive.ql.**exec.MapOperator.process(**MapOperator.java:546)
... 9 more

2013-04-24 10:15:51,422 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 3 finished. closing...
2013-04-24 10:15:51,422 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 3 forwarded 1 rows
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.MapOperator: DESERIALIZE_ERRORS:0
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 finished. closing...
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 forwarded 1 rows
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 1 finished. closing...
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 1 forwarded 1 rows
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: 2 finished. closing...
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: 2 forwarded 0 rows
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: TABLE_ID_1_ROWCOUNT:0
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 1 Close done
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 Close done
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 3 Close done
2013-04-24 10:15:51,423 INFO ExecMapper: ExecMapper: processed 0 rows: used memory = 23614288
2013-04-24 10:15:51,435 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2013-04-24 10:15:51,439 WARN org.apache.hadoop.mapred.Child: Error running child
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"rid":1,"mapids":[2,3,4],"rdate":"1234","rdata":{5:"8",6:"9"}}
at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"rid":1,"mapids":[2,3,4],"rdate":"1234","rdata":{5:"8",6:"9"}}
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:565)
at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
... 8 more
Caused by: java.lang.ArrayStoreException
at java.lang.System.arraycopy(Native Method)
at java.util.ArrayList.toArray(ArrayList.java:306)
at org.elasticsearch.hadoop.hive.ESSerDe.hiveToWritable(ESSerDe.java:136)
at org.elasticsearch.hadoop.hive.ESSerDe.hiveToWritable(ESSerDe.java:197)
at org.elasticsearch.hadoop.hive.ESSerDe.serialize(ESSerDe.java:109)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:586)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:83)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:546)
... 9 more
2013-04-24 10:15:51,446 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task

Thanks,

On Wednesday, April 24, 2013 12:44:03 AM UTC-6, Costin Leau wrote:


On Tuesday, April 23, 2013 11:25:37 PM UTC+2, Abhishek Andhavarapu wrote:



The issue has been fixed in master.

Cheers!

On Thursday, April 25, 2013 7:22:57 PM UTC+3, Abhishek Andhavarapu wrote:

Thanks Costin.

On Thu, Apr 25, 2013 at 10:20 AM, Costin Leau <costi...@gmail.com> wrote:

Looks like an error in ESSerDe, for which I've raised an issue:
"Serialization bug in ESSerDe.hiveToWritable" (elasticsearch-hadoop issue #39 on GitHub)
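[For context on the trace above: `ArrayStoreException` comes from `ArrayList.toArray(T[])` copying elements into a target array with the wrong component type. This is a minimal standalone illustration of that failure mode, not the actual ESSerDe code:]

```java
import java.util.ArrayList;
import java.util.List;

public class ArrayStoreDemo {
    // Mirrors the ArrayList.toArray -> System.arraycopy frames in the trace:
    // copying Integer elements (as Hive supplies for an ARRAY<INT> column)
    // into a String[] throws ArrayStoreException.
    static boolean throwsArrayStore() {
        List<Object> values = new ArrayList<>();
        values.add(1);
        values.add(2);
        try {
            values.toArray(new String[2]); // wrong component type
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(throwsArrayStore()
                ? "ArrayStoreException thrown" : "no exception");
    }
}
```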


Costin, thanks. It works great. The only problem I see is when the map key/value
and array data types are int: I see random values in ES. It works great with
strings. I know I can force the mapping on the ES side to be int, but I'm
wondering whether it's a simple fix.
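[For the mapping workaround mentioned above, a hypothetical sketch: explicitly define the integer fields on the type before loading, e.g. via `PUT /radio/artists/_mapping` against the cluster. Field names are taken from the thread; the exact mapping body is an assumption, not from the original posts:]

```json
{
  "artists": {
    "properties": {
      "rid":    { "type": "integer" },
      "mapids": { "type": "integer" },
      "rdata":  { "type": "object" }
    }
  }
}
```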

On Monday, April 29, 2013 11:35:12 AM UTC-6, Costin Leau wrote:


On Wednesday, April 24, 2013 6:25:35 PM UTC+2, Abhishek Andhavarapu wrote:

Thanks Costin for the reply. Here is the error.

2013-04-24 10:15:50,990 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Adding alias maptest3 to work list for file hdfs://hadoop1.local:8020/user/hive/warehouse/maptest3
2013-04-24 10:15:50,996 INFO org.apache.hadoop.hive.ql.exec.MapOperator: dump TS struct<rid:int,mapids:array<int>,rdate:string,rdata:map<int,string>>
2013-04-24 10:15:50,997 INFO ExecMapper:
Id =3

Id =0

Id =1

Id =2
Id = 1 null<\Parent>
<\FS>
<\Children>
Id = 0 null<\Parent>
<\SEL>
<\Children>
Id = 3 null<\Parent>
<\TS>
<\Children>
<\MAP>
2013-04-24 10:15:50,997 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Initializing Self 3 MAP
2013-04-24 10:15:50,997 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: Initializing Self 0 TS
2013-04-24 10:15:50,997 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: Operator 0 TS initialized
2013-04-24 10:15:50,997 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: Initializing children of 0 TS
2013-04-24 10:15:50,997 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initializing child 1 SEL
2013-04-24 10:15:50,998 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initializing Self 1 SEL
2013-04-24 10:15:51,008 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: SELECT struct<rid:int,mapids:array<int>,rdate:string,rdata:map<int,string>>
2013-04-24 10:15:51,012 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Operator 1 SEL initialized
2013-04-24 10:15:51,012 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initializing children of 1 SEL
2013-04-24 10:15:51,012 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Initializing child 2 FS
2013-04-24 10:15:51,012 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Initializing Self 2 FS
2013-04-24 10:15:51,031 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Operator 2 FS initialized
2013-04-24 10:15:51,031 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Initialization Done 2 FS
2013-04-24 10:15:51,031 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initialization Done 1 SEL
2013-04-24 10:15:51,031 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: Initialization Done 0 TS
2013-04-24 10:15:51,031 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Initialization Done 3 MAP
2013-04-24 10:15:51,039 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Processing alias maptest3 for file hdfs://hadoop1.allegiance.local:8020/user/hive/warehouse/maptest3
2013-04-24 10:15:51,040 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 3 forwarding 1 rows
2013-04-24 10:15:51,040 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 forwarding 1 rows
2013-04-24 10:15:51,043 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 1 forwarding 1 rows
2013-04-24 10:15:51,043 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: New Final Path: FS /user/hive/warehouse/_tmp.maptest1/000000_3
2013-04-24 10:15:51,422 FATAL ExecMapper: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"rid":1,"mapids":[2,3,4],"rdate":"1234","rdata":{5:"8",6:"9"}}
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:565)
at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.lang.ArrayStoreException
at java.lang.System.arraycopy(Native Method)
at java.util.ArrayList.toArray(ArrayList.java:306)
at org.elasticsearch.hadoop.hive.ESSerDe.hiveToWritable(ESSerDe.java:136)
at org.elasticsearch.hadoop.hive.ESSerDe.hiveToWritable(ESSerDe.java:197)
at org.elasticsearch.hadoop.hive.ESSerDe.serialize(ESSerDe.java:109)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:586)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:83)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:546)
... 9 more

2013-04-24 10:15:51,422 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 3 finished. closing...
2013-04-24 10:15:51,422 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 3 forwarded 1 rows
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.MapOperator: DESERIALIZE_ERRORS:0
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 finished. closing...
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 forwarded 1 rows
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 1 finished. closing...
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 1 forwarded 1 rows
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: 2 finished. closing...
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: 2 forwarded 0 rows
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: TABLE_ID_1_ROWCOUNT:0
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 1 Close done
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 Close done
2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 3 Close done
2013-04-24 10:15:51,423 INFO ExecMapper: ExecMapper: processed 0 rows: used memory = 23614288
2013-04-24 10:15:51,435 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2013-04-24 10:15:51,439 WARN org.apache.hadoop.mapred.Child: Error running child
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"rid":1,"mapids":[2,3,4],"rdate":"1234","rdata":{5:"8",6:"9"}}
at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"rid":1,"mapids":[2,3,4],"rdate":"1234","rdata":{5:"8",6:"9"}}
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:565)
at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
... 8 more
Caused by: java.lang.ArrayStoreException
at java.lang.System.arraycopy(Native Method)
at java.util.ArrayList.toArray(ArrayList.java:306)
at org.elasticsearch.hadoop.hive.ESSerDe.hiveToWritable(ESSerDe.java:136)
at org.elasticsearch.hadoop.hive.ESSerDe.hiveToWritable(ESSerDe.java:197)
at org.elasticsearch.hadoop.hive.ESSerDe.serialize(ESSerDe.java:109)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:586)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
                          at org.apache.hadoop.hive.ql.**exec.Operator.forward(**Operator.java:800)
                          at org.apache.hadoop.hive.ql.**exec.SelectOperator.processOp(**SelectOperator.java:84)
                          at org.apache.hadoop.hive.ql.**exec.Operator.process(**Operator.java:474)
                          at org.apache.hadoop.hive.ql.**exec.Operator.forward(**Operator.java:800)
                          at org.apache.hadoop.hive.ql.**exec.TableScanOperator.**processOp(TableScanOperator.**java:83)
                          at org.apache.hadoop.hive.ql.**exec.Operator.process(**Operator.java:474)
                          at org.apache.hadoop.hive.ql.**exec.Operator.forward(**Operator.java:800)
                          at org.apache.hadoop.hive.ql.**exec.MapOperator.process(**MapOperator.java:546)
                          ... 9 more
                          2013-04-24 10:15:51,446 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task

Thanks,


Costin,

The integer data type doesn't work at all. I've added logs to the
BufferedRestClient addToIndex method: the first log is the Writable object's
toString, and the second is the value the mapper writes as a string. The
integer data type used to work fine before this commit, though.

2013-04-30 09:57:11,466 INFO org.elasticsearch.hadoop.rest.BufferedRestClient: Writable{rid=[B@1a3650ed, rdata={[B@4e0a2a38=8, [B@7d59ea8e=9}, rdate=1234, mapids=[[B@63fb050c, [B@75088a1b, [B@3a32ea4]}

2013-04-30 09:57:11,536 INFO org.elasticsearch.hadoop.rest.BufferedRestClient: ES index query{"index":{}}
{"rid":"AAAAAQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=","rdata":{"[B@4e0a2a38":"8","[B@7d59ea8e":"9"},"rdate":"1234","mapids":["AAAAAgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=","AAAAAwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=","AAAABAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA="]}
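Those opaque values look like raw Writable byte buffers being serialized instead of the int values: base64-decoding the start of the rid entry yields the bytes 00 00 00 01, i.e. a big-endian 1, followed by zero padding (and likewise 2, 3, 4 for the mapids entries). A quick sketch of that encoding, assuming a zero-padded 32-byte backing buffer (the buffer size is a guess):

```java
import java.util.Base64;

public class WritableBufferDemo {
    public static void main(String[] args) {
        // An IntWritable-style value: the int 1 stored big-endian in the
        // first four bytes of a zero-padded backing buffer (size assumed).
        byte[] buffer = new byte[32];
        buffer[3] = 1; // bytes 00 00 00 01 == big-endian 1

        // Base64 of the raw buffer reproduces the "AAAAAQ..." prefix
        // seen for rid=1 in the log above.
        System.out.println(Base64.getEncoder().encodeToString(buffer));
    }
}
```

If that reading is right, the serializer is writing the Writable's internal byte buffer (and, for the map keys, the byte[]'s toString, hence the `[B@...` keys) rather than first converting to the underlying primitive.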

Please let me know if you need more information; I can add more logs.

Thanks,
Abhishek

On Monday, April 29, 2013 5:01:03 PM UTC-6, Abhishek Andhavarapu wrote:

Costin, thanks. It works great. The only problem I see is when the map
key/value or array data type is int: I see random values in ES. It works
great with strings. I know I can force the mapping on the ES side to be int,
but I'm just wondering if it's a simple fix.

On Monday, April 29, 2013 11:35:12 AM UTC-6, Costin Leau wrote:

The issue has been fixed in master.

Cheers!

On Thursday, April 25, 2013 7:22:57 PM UTC+3, Abhishek Andhavarapu wrote:

Thanks Costin.

On Thu, Apr 25, 2013 at 10:20 AM, Costin Leau costi...@gmail.com wrote:

Looks like an error in ESSerDe, for which I've raised an issue:
https://github.com/elasticsearch/elasticsearch-hadoop/issues/39
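For context on the stack trace: `ArrayList.toArray(T[])` copies elements with `System.arraycopy`, which throws `ArrayStoreException` the moment an element's runtime type doesn't match the destination array's component type. A standalone sketch of that failure mode (illustrative only, not the actual ESSerDe code):

```java
import java.util.ArrayList;
import java.util.List;

public class ArrayStoreDemo {
    public static void main(String[] args) {
        // A list that actually holds Integers at runtime...
        List<Object> values = new ArrayList<>();
        values.add(2);
        values.add(3);

        try {
            // ...copied into a String[]: System.arraycopy inside
            // toArray() rejects the mismatched element type.
            values.toArray(new String[0]);
        } catch (ArrayStoreException e) {
            System.out.println("ArrayStoreException, as in ESSerDe.hiveToWritable");
        }
    }
}
```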


Hi Abhishek,

I've fixed this in master and pushed a snapshot with the fix [1]. Let me know how it works for you.

Cheers,

[1] http://build.elasticsearch.org/browse/ESHADOOP-NIGHTLY-36

                                               at org.apache.hadoop.mapred.__MapTask.runOldMapper(MapTask.__java:418)
                                               at org.apache.hadoop.mapred.__MapTask.run(MapTask.java:333)
                                               at org.apache.hadoop.mapred.__Child$4.run(Child.java:268)
                                               at java.security.__AccessController.doPrivileged(__Native Method)
                                               at javax.security.auth.Subject.__doAs(Subject.java:396)
                                               at org.apache.hadoop.security.__UserGroupInformation.doAs(__UserGroupInformation.java:__1408)
                                               at org.apache.hadoop.mapred.__Child.main(Child.java:262)
                                               Caused by: org.apache.hadoop.hive.ql.__metadata.HiveException: Hive Runtime Error while processing row {"rid":1,"mapids":[2,3,4],"__rdate":"1234","rdata":{5:"8",__6:"9"}}
                                               at org.apache.hadoop.hive.ql.__exec.MapOperator.process(__MapOperator.java:565)
                                               at org.apache.hadoop.hive.ql.__exec.ExecMapper.map(__ExecMapper.java:143)
                                               ... 8 more
                                               Caused by: java.lang.ArrayStoreException
                                               at java.lang.System.arraycopy(__Native Method)
                                               at java.util.ArrayList.toArray(__ArrayList.java:306)
                                               at org.elasticsearch.hadoop.hive.__ESSerDe.hiveToWritable(__ESSerDe.java:136)
                                               at org.elasticsearch.hadoop.hive.__ESSerDe.hiveToWritable(__ESSerDe.java:197)
                                               at org.elasticsearch.hadoop.hive.__ESSerDe.serialize(ESSerDe.__java:109)
                                               at org.apache.hadoop.hive.ql.__exec.FileSinkOperator.__processOp(FileSinkOperator.__java:586)
                                               at org.apache.hadoop.hive.ql.__exec.Operator.process(__Operator.java:474)
                                               at org.apache.hadoop.hive.ql.__exec.Operator.forward(__Operator.java:800)
                                               at org.apache.hadoop.hive.ql.__exec.SelectOperator.processOp(__SelectOperator.java:84)
                                               at org.apache.hadoop.hive.ql.__exec.Operator.process(__Operator.java:474)
                                               at org.apache.hadoop.hive.ql.__exec.Operator.forward(__Operator.java:800)
                                               at org.apache.hadoop.hive.ql.__exec.TableScanOperator.__processOp(TableScanOperator.__java:83)
                                               at org.apache.hadoop.hive.ql.__exec.Operator.process(__Operator.java:474)
                                               at org.apache.hadoop.hive.ql.__exec.Operator.forward(__Operator.java:800)
                                               at org.apache.hadoop.hive.ql.__exec.MapOperator.process(__MapOperator.java:546)
                                               ... 9 more
                                               2013-04-24 10:15:51,446 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
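The ArrayStoreException originates in ESSerDe.hiveToWritable copying a Hive collection whose element types don't all match. One hypothesis worth testing (an assumption on my part, not a confirmed fix): since JSON object keys are always strings, declaring the map keys as STRING keeps the SerDe from receiving integer keys mixed in with string values. A sketch of that table variant:

```sql
-- Hypothetical variant of the maptest_ex table from this thread.
-- JSON object keys are always strings, so MAP<STRING,STRING> avoids
-- handing ESSerDe integer keys alongside string values.
CREATE EXTERNAL TABLE maptest_ex2(
    rid    INT,
    mapids ARRAY<INT>,
    rdata  MAP<STRING,STRING>)
STORED BY 'org.elasticsearch.hadoop.hive.ESStorageHandler'
TBLPROPERTIES(
    'es.host' = 'elasticsearch1',
    'es.resource' = 'radio/artists/');
```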


                Thanks,


                On Wednesday, April 24, 2013 12:44:03 AM UTC-6, Costin Leau wrote:




--
Costin


Hi all,

I'm having trouble understanding what I should do to push data
from Hive to ES.

I have installed the elasticsearch plugin, ES, Hive and so on.

I followed the instructions from https://github.com/elasticsearch/elasticsearch-hadoop#configuration-properties :

CREATE EXTERNAL TABLE artists (
id BIGINT,
name STRING,
links STRUCT<url:STRING, picture:STRING>)STORED BY 'org.elasticsearch.hadoop.hive.ESStorageHandler'

TBLPROPERTIES('es.resource' = 'radio/artists/');

That seems to work properly.

I created the index artist on Elasticsearch.

But in the end, this command doesn't work:

INSERT OVERWRITE TABLE Darty_Mapping
SELECT id, name FROM artist_hive;

(artist_hive has the same structure as artist, but it's a local hive table)

I get this error:

Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201306201709_0026, Tracking URL =
http://hdpnoddev1.intranet.darty.fr:50030/jobdetails.jsp?jobid=job_201306201709_0026
Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_201306201709_0026
Hadoop job information for Stage-0: number of mappers: 1; number of
reducers: 0
2013-06-25 17:00:14,410 Stage-0 map = 0%, reduce = 0%
2013-06-25 17:00:37,564 Stage-0 map = 100%, reduce = 100%
Ended Job = job_201306201709_0026 with errors
Error during job, obtaining debugging information...
Job Tracking URL:
http://hdpnoddev1.intranet.darty.fr:50030/jobdetails.jsp?jobid=job_201306201709_0026
Examining task ID: task_201306201709_0026_m_000002 (and more) from job
job_201306201709_0026

Task with the most failures(4):

Task ID:
task_201306201709_0026_m_000000

URL:

http://hdpnoddev1.intranet.darty.fr:50030/taskdetails.jsp?jobid=job_201306201709_0026&tipid=task_201306201709_0026_m_000000

Diagnostic Messages for this Task:
java.lang.RuntimeException: Error in configuring object
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:72)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:130)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:413)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.ja

FAILED: Execution Error, return code 2 from
org.apache.hadoop.hive.ql.exec.MapRedTask
MapReduce Jobs Launched:
Job 0: Map: 1 HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
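A separate thing to rule out (a guess; the truncated trace above hides the root cause): "Error in configuring object" with an InvocationTargetException at setJobConf often means the elasticsearch-hadoop jar isn't on the task classpath, so the storage handler class can't be instantiated in the map tasks. The jar can be registered per Hive session (the path below is a placeholder):

```sql
-- Placeholder path: point this at the actual elasticsearch-hadoop jar
-- so every map task can load org.elasticsearch.hadoop.hive.ESStorageHandler.
ADD JAR /path/to/elasticsearch-hadoop.jar;
```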

Abhishek talks about a mapping table; what is it exactly?

Any help would be kindly appreciated,

Thanks,
Fabien

PS: I'm a real beginner in the Linux/Hadoop environment, sorry if it's a
silly question :)


Fabien,

The mapping I'm talking about is the Elasticsearch mapping; it's like an ES index schema.

Your external table structure has three columns but your select statement has only two fields, name and id. Try select * (all), or change the external table schema to just name and id. You don't need to worry about the mappings.
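As a concrete sketch of the advice above (table and column names are taken from this thread; adjust to your own schema):

```sql
-- The external table declares two columns, so the SELECT feeding it
-- must produce exactly two columns of matching types.
CREATE EXTERNAL TABLE artists_ex (
    id   BIGINT,
    name STRING)
STORED BY 'org.elasticsearch.hadoop.hive.ESStorageHandler'
TBLPROPERTIES('es.resource' = 'radio/artists/');

INSERT OVERWRITE TABLE artists_ex
SELECT id, name FROM artist_hive;
```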

Thanks.
Abhishek

Sent from my iPhone

On Jun 25, 2013, at 11:39 AM, Fabien Chung chung.fabien@gmail.com wrote:


Hi,

thanks for your answer. Sorry I didn't copy the right command line.

Anyway, I still have the same problem reading from and writing to ES from Hive.

In ES:

{
  "_index": "radio",
  "_type": "artists",
  "_id": "1",
  "_score": 1,
  "_source": {
    "id": 1,
    "name": "tata"
  }
},
{
  "_index": "radio",
  "_type": "artists",
  "_id": "2",
  "_score": 1,
  "_source": {
    "id": 2,
    "name": "test"
  }
}
I can't get any results; here's what I tried:

hive> CREATE EXTERNAL TABLE artists2 (
    >   id BIGINT,
    >   name STRING)
    > STORED BY 'org.elasticsearch.hadoop.hive.ESStorageHandler'
    > TBLPROPERTIES('es.resource' = 'radio/artists/2');

select * from artists2;
Failed with exception java.io.IOException:java.lang.IllegalStateException: [GET] on [radio/artists/2/_search_shards] failed; server[http://localhost:9200] returned [No handler found for uri [radio/artists/2/_search_shards] and method [GET]]
Time taken: 0.223 seconds

hive> CREATE EXTERNAL TABLE artists (
    >   id BIGINT,
    >   name STRING)
    > STORED BY 'org.elasticsearch.hadoop.hive.ESStorageHandler'
    > TBLPROPERTIES('es.resource' = 'radio/artists/_search?q=me*');

hive> select * from artists;
OK
Time taken: 0.244 seconds
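Two separate things seem to be going on above (my reading, not a confirmed diagnosis). First, 'radio/artists/2' names a single document, not an index/type, which is why the _search_shards call fails on it. Second, 'radio/artists/_search?q=me*' is a readable resource, but the query me* matches neither "tata" nor "test", so zero rows is the expected result. A sketch that may return both documents:

```sql
-- es.resource for reads points at index/type; the query-in-URI form
-- follows the project's early README (newer versions move the query
-- into a separate 'es.query' property).
CREATE EXTERNAL TABLE artists_all (
    id   BIGINT,
    name STRING)
STORED BY 'org.elasticsearch.hadoop.hive.ESStorageHandler'
TBLPROPERTIES('es.resource' = 'radio/artists/_search?q=*');
```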

Regards,

Fabien

2013/6/25 Abhishek abhishek376@gmail.com


--
Chung Fabien

EFREI Promo 2013
Tel : 06 48 03 54 92


Hi Fabien,
I am also getting the same error message. Can you please tell me the
solution, if you have managed to get rid of this error? Thanks in advance.

Regards,
Mohit Kumar Yadav

On Wednesday, June 26, 2013 2:59:54 PM UTC+5:30, Fabien Chung wrote:



Hi,

I would be glad to help you; unfortunately, I stopped using ES a year ago
and I can't remember how I solved this issue.

Sorry,

Fabien

2014-09-18 11:05 GMT+02:00 Mohit Kumar Yadav mohit.kumar.ngi@gmail.com:

Hi Fabien,
I also getting the same error message. can you please tell me what is the
solution for it if you have get rid of this error.?
thanks in advance
regards
Mohit Kumar Yadav

On Wednesday, June 26, 2013 2:59:54 PM UTC+5:30, Fabien Chung wrote:

Hi,

thanks for your answer. Sorry I didn't copy the right command line.

Any way I still have the same probleme to read write on ES from hive.

in ES :

  • {
    • _index: radio
    • _type: artists
    • _id: 1
    • _score: 1
    • _source: {
      • id: 1
      • name: tata
        }
        }
  • {
    • _index: radio
    • _type: artists
    • _id: 2
    • _score: 1
    • _source: {
      • id: 2
      • name: test
        }

i can't get any results, here what i tried :

hive> CREATE EXTERNAL TABLE artists2 (
    >   id   BIGINT,
    >   name STRING)
    > STORED BY 'org.elasticsearch.hadoop.hive.ESStorageHandler'
    > TBLPROPERTIES('es.resource' = 'radio/artists/2');

select * from artists2;
Failed with exception java.io.IOException: java.lang.IllegalStateException:
[GET] on [radio/artists/2/_search_shards] failed; server [http://localhost:9200]
returned [No handler found for uri [radio/artists/2/_search_shards] and method [GET]]
Time taken: 0.223 seconds

hive> CREATE EXTERNAL TABLE artists (
    >   id   BIGINT,
    >   name STRING)
    > STORED BY 'org.elasticsearch.hadoop.hive.ESStorageHandler'
    > TBLPROPERTIES('es.resource' = 'radio/artists/_search?q=me*');

hive> select * from artists;
OK
Time taken: 0.244 seconds

Regards,

Fabien

2013/6/25 Abhishek abhis...@gmail.com

Fabien,

The mapping I'm talking about is the Elasticsearch mapping; it's like the ES
index schema.

Your external table structure has three columns, but your SELECT statement
has only two fields, name and id. Either select all the columns, or change
the external table schema to just name and id. You don't need to worry about
the mappings.
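
For instance, keeping the SELECT list and the external table schema in sync
could look like this (a sketch based on the tables in this thread;
artists_slim is a made-up name for illustration):

    -- Option 1: select every column the external table declares
    INSERT OVERWRITE TABLE artists
    SELECT id, name, links FROM artist_hive;

    -- Option 2: declare an external table with only the columns you select
    CREATE EXTERNAL TABLE artists_slim (
      id   BIGINT,
      name STRING)
    STORED BY 'org.elasticsearch.hadoop.hive.ESStorageHandler'
    TBLPROPERTIES('es.resource' = 'radio/artists/');

    INSERT OVERWRITE TABLE artists_slim
    SELECT id, name FROM artist_hive;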

Thanks.
Abhishek

Sent from my iPhone

On Jun 25, 2013, at 11:39 AM, Fabien Chung chung....@gmail.com wrote:

Hi all,

I'm having some trouble understanding what I should do to push data
from Hive to ES.

I have installed the Elasticsearch plugin, ES, Hive, and so on.

I followed the instructions from
https://github.com/elasticsearch/elasticsearch-hadoop#configuration-properties

CREATE EXTERNAL TABLE artists (
  id    BIGINT,
  name  STRING,
  links STRUCT<url:STRING, picture:STRING>)
STORED BY 'org.elasticsearch.hadoop.hive.ESStorageHandler'
TBLPROPERTIES('es.resource' = 'radio/artists/');

That seems to work properly.

I created the index artist on Elasticsearch.

But in the end, this command doesn't work:

INSERT OVERWRITE TABLE Darty_Mapping
SELECT id, name FROM artist_hive;

(artist_hive has the same structure as artists, but it's a local Hive
table)

I have this error :

Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201306201709_0026, Tracking URL =
http://hdpnoddev1.intranet.darty.fr:50030/jobdetails.jsp?jobid=job_201306201709_0026
Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_201306201709_0026
Hadoop job information for Stage-0: number of mappers: 1; number of
reducers: 0
2013-06-25 17:00:14,410 Stage-0 map = 0%, reduce = 0%
2013-06-25 17:00:37,564 Stage-0 map = 100%, reduce = 100%
Ended Job = job_201306201709_0026 with errors
Error during job, obtaining debugging information...
Job Tracking URL:
http://hdpnoddev1.intranet.darty.fr:50030/jobdetails.jsp?jobid=job_201306201709_0026
Examining task ID: task_201306201709_0026_m_000002 (and more) from job
job_201306201709_0026

Task with the most failures(4):

Task ID:
task_201306201709_0026_m_000000

URL:
http://hdpnoddev1.intranet.darty.fr:50030/taskdetails.jsp?jobid=job_201306201709_0026&tipid=task_201306201709_0026_m_000000

Diagnostic Messages for this Task:
java.lang.RuntimeException: Error in configuring object
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:72)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:130)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:413)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
        at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.ja

FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
MapReduce Jobs Launched:
Job 0: Map: 1 HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec

Abhishek mentions a mapping table; what is it exactly?

Any help will be kindly appreciated,

Thanks,
Fabien

PS: I'm a real beginner in the Linux / Hadoop environment, sorry if it's
a silly question :)


--
Chung Fabien

EFREI Promo 2013
Tel : 06 48 03 54 92


--
Chung Fabien

Consultant Junior YSANCE
Tel : +33 6 48 03 54 92


It's quite easy: the es.resource format is incorrect. It should be index/type, as in radio/artists.
If you want/need to specify an ID or other criteria, you should do so in the query (through es.query).
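
For example, the two failing tables above could be rewritten along these
lines (a sketch only; artists_filtered is an illustrative name, and the
exact es.query syntax may vary between es-hadoop versions, so check the
configuration docs):

    -- es.resource stays at index/type; the filter moves into es.query
    CREATE EXTERNAL TABLE artists_filtered (
      id   BIGINT,
      name STRING)
    STORED BY 'org.elasticsearch.hadoop.hive.ESStorageHandler'
    TBLPROPERTIES(
      'es.resource' = 'radio/artists',
      'es.query'    = '?q=name:me*');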


--
Costin
