I've fixed this in master and pushed a snapshot with the fix [1]. Let me know how it works for you.
Costin,
the integer data type doesn't work at all. I've added logs to the BufferedRestClient addToIndex method. The first log is the
Writable object's toString() and the second log is the mapper's writeValueAsString output. The integer data type used to be
fine before this commit, though.
2013-04-30 09:57:11,466 INFO org.elasticsearch.hadoop.rest.BufferedRestClient: Writable{rid=[B@1a3650ed, rdata={[B@4e0a2a38=8, [B@7d59ea8e=9}, rdate=1234, mapids=[[B@63fb050c, [B@75088a1b, [B@3a32ea4]}
2013-04-30 09:57:11,536 INFO org.elasticsearch.hadoop.rest.BufferedRestClient: ES index query{"index":{}}
                           {"rid":"AAAAAQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=","rdata":{"[B@4e0a2a38":"8","[B@7d59ea8e":"9"},"rdate":"1234","mapids":["AAAAAgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=","AAAAAwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=","AAAABAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA="]}
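The `[B@...` tokens in the log above are a clue: in Java, calling toString() on a byte[] returns the type tag plus an identity hash, not the contents. A minimal illustration (this is only a sketch of the symptom, not the es-hadoop source):

```java
// Illustration only: a Java byte[] has no useful toString(), so logging
// or stringifying the raw array produces "[B@<hash>" instead of the data.
public class ByteArrayToStringDemo {
    public static void main(String[] args) {
        byte[] key = new byte[] {0, 0, 0, 5}; // e.g. a serialized int key
        String printed = key.toString();
        System.out.println(printed.startsWith("[B@")); // true: identity hash, not the bytes
    }
}
```

That matches the payload above, where the map keys appear as `"[B@4e0a2a38"` - the raw array object was stringified where a decoded integer was expected.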
Please let me know if you need more information; I can add more logs.
Thanks,
Abhishek
On Monday, April 29, 2013 5:01:03 PM UTC-6, Abhishek Andhavarapu wrote:
Costin, thanks. It works great. The only problem I see is when the map key/value or array data type is int - I see random
values in ES. It works great with strings. I know I can force the mapping on the ES side to be int, but I was just wondering
if it's a simple fix.
On Monday, April 29, 2013 11:35:12 AM UTC-6, Costin Leau wrote:
    The issue has been fixed in master.
    Cheers!
    On Thursday, April 25, 2013 7:22:57 PM UTC+3, Abhishek Andhavarapu wrote:
        Thanks Costin.
        On Thu, Apr 25, 2013 at 10:20 AM, Costin Leau <costi...@gmail.com> wrote:
            Looks like an error in ESSerDe for which I've raised an issue:
            https://github.com/elasticsearch/elasticsearch-hadoop/issues/39
            On Wednesday, April 24, 2013 6:25:35 PM UTC+2, Abhishek Andhavarapu wrote:
                Thanks Costin for the reply. Here is the error.
                    2013-04-24 10:15:50,990 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Adding alias maptest3 to work list for file hdfs://hadoop1.local:8020/user/hive/warehouse/maptest3
                    2013-04-24 10:15:50,996 INFO org.apache.hadoop.hive.ql.exec.MapOperator: dump TS struct<rid:int,mapids:array<int>,rdate:string,rdata:map<int,string>>
                    2013-04-24 10:15:50,997 INFO ExecMapper:
                    <MAP>Id =3
                    <Children>
                    <TS>Id =0
                    <Children>
                    <SEL>Id =1
                    <Children>
                    <FS>Id =2
                    <Parent>Id = 1 null<\Parent>
                    <\FS>
                    <\Children>
                    <Parent>Id = 0 null<\Parent>
                    <\SEL>
                    <\Children>
                    <Parent>Id = 3 null<\Parent>
                    <\TS>
                    <\Children>
                    <\MAP>
                    2013-04-24 10:15:50,997 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Initializing Self 3 MAP
                    2013-04-24 10:15:50,997 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: Initializing Self 0 TS
                    2013-04-24 10:15:50,997 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: Operator 0 TS initialized
                    2013-04-24 10:15:50,997 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: Initializing children of 0 TS
                    2013-04-24 10:15:50,997 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initializing child 1 SEL
                    2013-04-24 10:15:50,998 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initializing Self 1 SEL
                    2013-04-24 10:15:51,008 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: SELECT struct<rid:int,mapids:array<int>,rdate:string,rdata:map<int,string>>
                    2013-04-24 10:15:51,012 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Operator 1 SEL initialized
                    2013-04-24 10:15:51,012 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initializing children of 1 SEL
                    2013-04-24 10:15:51,012 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Initializing child 2 FS
                    2013-04-24 10:15:51,012 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Initializing Self 2 FS
                    2013-04-24 10:15:51,031 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Operator 2 FS initialized
                    2013-04-24 10:15:51,031 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Initialization Done 2 FS
                    2013-04-24 10:15:51,031 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initialization Done 1 SEL
                    2013-04-24 10:15:51,031 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: Initialization Done 0 TS
                    2013-04-24 10:15:51,031 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Initialization Done 3 MAP
                    2013-04-24 10:15:51,039 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Processing alias maptest3 for file hdfs://hadoop1.allegiance.local:8020/user/hive/warehouse/maptest3
                    2013-04-24 10:15:51,040 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 3 forwarding 1 rows
                    2013-04-24 10:15:51,040 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 forwarding 1 rows
                    2013-04-24 10:15:51,043 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 1 forwarding 1 rows
                    2013-04-24 10:15:51,043 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: New Final Path: FS /user/hive/warehouse/_tmp.maptest1/000000_3
                    2013-04-24 10:15:51,422 FATAL ExecMapper: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"rid":1,"mapids":[2,3,4],"rdate":"1234","rdata":{5:"8",6:"9"}}
                        at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:565)
                        at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
                        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
                        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
                        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
                        at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
                        at java.security.AccessController.doPrivileged(Native Method)
                        at javax.security.auth.Subject.doAs(Subject.java:396)
                        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
                        at org.apache.hadoop.mapred.Child.main(Child.java:262)
                    Caused by: java.lang.ArrayStoreException
                        at java.lang.System.arraycopy(Native Method)
                        at java.util.ArrayList.toArray(ArrayList.java:306)
                        at org.elasticsearch.hadoop.hive.ESSerDe.hiveToWritable(ESSerDe.java:136)
                        at org.elasticsearch.hadoop.hive.ESSerDe.hiveToWritable(ESSerDe.java:197)
                        at org.elasticsearch.hadoop.hive.ESSerDe.serialize(ESSerDe.java:109)
                        at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:586)
                        at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
                        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
                        at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
                        at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
                        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
                        at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:83)
                        at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
                        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
                        at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:546)
                        ... 9 more
                    2013-04-24 10:15:51,422 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 3 finished. closing...
                    2013-04-24 10:15:51,422 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 3 forwarded 1 rows
                    2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.MapOperator: DESERIALIZE_ERRORS:0
                    2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 finished. closing...
                    2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 forwarded 1 rows
                    2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 1 finished. closing...
                    2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 1 forwarded 1 rows
                    2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: 2 finished. closing...
                    2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: 2 forwarded 0 rows
                    2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: TABLE_ID_1_ROWCOUNT:0
                    2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 1 Close done
                    2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 Close done
                    2013-04-24 10:15:51,423 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 3 Close done
                    2013-04-24 10:15:51,423 INFO ExecMapper: ExecMapper: processed 0 rows: used memory = 23614288
                    2013-04-24 10:15:51,435 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
                    2013-04-24 10:15:51,439 WARN org.apache.hadoop.mapred.Child: Error running child
                    java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"rid":1,"mapids":[2,3,4],"rdate":"1234","rdata":{5:"8",6:"9"}}
                        at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161)
                        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
                        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
                        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
                        at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
                        at java.security.AccessController.doPrivileged(Native Method)
                        at javax.security.auth.Subject.doAs(Subject.java:396)
                        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
                        at org.apache.hadoop.mapred.Child.main(Child.java:262)
                    Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"rid":1,"mapids":[2,3,4],"rdate":"1234","rdata":{5:"8",6:"9"}}
                        at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:565)
                        at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
                        ... 8 more
                    Caused by: java.lang.ArrayStoreException
                        at java.lang.System.arraycopy(Native Method)
                        at java.util.ArrayList.toArray(ArrayList.java:306)
                        at org.elasticsearch.hadoop.hive.ESSerDe.hiveToWritable(ESSerDe.java:136)
                        at org.elasticsearch.hadoop.hive.ESSerDe.hiveToWritable(ESSerDe.java:197)
                        at org.elasticsearch.hadoop.hive.ESSerDe.serialize(ESSerDe.java:109)
                        at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:586)
                        at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
                        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
                        at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
                        at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
                        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
                        at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:83)
                        at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
                        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
                        at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:546)
                        ... 9 more
                    2013-04-24 10:15:51,446 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
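For context on the trace above: ArrayList.toArray(T[]) copies elements with System.arraycopy, which throws ArrayStoreException when an element is not assignable to the target array's component type. A minimal sketch of that failure mode (the ESSerDe internals are an assumption here, only the exception mechanics are shown):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the ArrayStoreException pattern seen in the Hive log:
// a list holding mixed element types is copied into a too-narrow array.
public class ArrayStoreDemo {
    public static void main(String[] args) {
        List<Object> values = new ArrayList<>();
        values.add("g1");               // e.g. a string column value
        values.add(Integer.valueOf(2)); // an int value in the same list
        boolean threw = false;
        try {
            // Fails on the Integer element, just as toArray() failed
            // inside ESSerDe.hiveToWritable in the trace above.
            String[] out = values.toArray(new String[0]);
        } catch (ArrayStoreException e) {
            threw = true;
        }
        System.out.println(threw); // true
    }
}
```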
                Thanks,
                On Wednesday, April 24, 2013 12:44:03 AM UTC-6, Costin Leau wrote:
                    Hi,
                    1) What's the problem? Is there an error message that you receive? Except for UNIONs, Arrays (or Lists)
                    as well as Maps should work.
                    2) The ES-Hadoop integration sits outside ES. It's just something added to the Hadoop environment to
                    talk to ES, and the reason for that is to take advantage of the map/reduce capabilities, which map
                    nicely on top of ES.
                    A river or a single-instance process would render the parallel capabilities of Hadoop void.
                    3) Hive doesn't support an UPDATE statement - just INSERT and INSERT OVERWRITE, which don't really
                    apply here since it's an external table. We might extend the INSERT OVERWRITE semantics, but that is
                    tricky since it requires the notion of an ID - typically INSERT OVERWRITE is the equivalent of
                    dropping a table and then adding data into it, which is clearly not an update.
                    You are better off handling the UPDATE directly in ES.
                    Note that in Hive (as with the rest of the map/reduce frameworks) data is not updated, but
                    rather copied and transformed.
                    Cheers,
                    On Tuesday, April 23, 2013 11:25:37 PM UTC+2, Abhishek Andhavarapu wrote:
                        Hi All,
                        I'm trying to push data from Hive to Elasticsearch using external tables
                        (https://github.com/elasticsearch/elasticsearch-hadoop).
                        My ES index mapping
                        {
                          "rid": 1,
                          "mapids": [2, 3, 4],  // Array
                          "data": [             // Nested objects
                            {
                              "mapid": "5",
                              "value": "g1"
                            },
                            {
                              "mapid": "6",
                              "value": "g2"
                            }
                          ]
                        }
                        My Hive table structure
                        CREATE EXTERNAL TABLE maptest_ex(
                             rid      INT,
                             mapids  ARRAY<INT>,
                             rdata     MAP<INT,STRING>)
                        STORED BY 'org.elasticsearch.hadoop.hive.ESStorageHandler'
                        TBLPROPERTIES(
                        'es.host' = 'elasticsearch1',
                        'es.resource' = 'radio/artists/')
                        and I'm trying to push data from a local Hive table to the external table:
                        insert into table maptest_ex
                           select rid,mapids,rdata from maptest3
                        1) The push works for simple data types like int and string, but not for arrays and maps. How do
                        I push such data from Hive to ES?
                        2) Is there a Hive river I could use?
                        3) How do I update a document in ES? (If a row already exists, can the ES storage handler delete
                        the existing ES document and insert the new/updated one?)
                        Any help is appreciated,
                        Thanks
            --
            You received this message because you are subscribed to a topic in the Google Groups "elasticsearch" group.
            To unsubscribe from this topic, visit
            https://groups.google.com/d/topic/elasticsearch/BAaoqF6SkiY/unsubscribe?hl=en-US.
            To unsubscribe from this group and all its topics, send an email to elasticsearc...@googlegroups.com.
            For more options, visit https://groups.google.com/groups/opt_out.