Trouble loading data from my Hadoop cluster into Elasticsearch via Pig and Hive

I have some issues with my project and am seeking some guidance and help.

Here is the situation:

Pig:

I have a hadoop cluster managed by Cloudera CDH 5.3.

I have ElasticSearch 1.4.4 installed on my master machine (10.44.162.169).

I have downloaded the Marvel plugin, so I access my ES via:
http://10.44.162.169:9200/_plugin/marvel/kibana/index.html#/dashboard/file/marvel.overview.json

I have created an index named myindex via Sense, with a type named mytype, to push my data into later.
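For reference, the index creation was equivalent to something along these lines (shown here as a curl command; I actually ran it in Sense, and the mapping fields are only illustrative, not my exact mapping):

curl -XPUT 'http://10.44.162.169:9200/myindex' -d '
{
  "mappings": {
    "mytype": {
      "properties": {
        "caller":        { "type": "integer" },
        "called_number": { "type": "integer" },
        "call_duration": { "type": "integer" },
        "location":      { "type": "string" },
        "call_date":     { "type": "string" }
      }
    }
  }
}'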

I also installed Kibana 4 and changed kibana.yml like this:

# The host to bind the server to
host: "10.44.162.169"

# The Elasticsearch instance to use for all your queries.
elasticsearch_url: "http://10.44.162.169:9200"

I access it via port 5601 (10.44.162.169:5601).

Now I want to load some data I have in HDFS into my Elasticsearch.

After downloading the es-hadoop JAR and adding it to the path, this is how I proceeded:

REGISTER /usr/elasticsearch-hadoop-2.0.2/dist/elasticsearch-hadoop-pig-2.0.2.jar

--load the CDR.csv file
cdr = LOAD '/user/omar/CDR.csv' using PigStorage(';')
AS (TRAFFIC_TYPE_ID:int,APPELANT:int,CALLED_NUMBER:int,CALL_DURATION:int,LOCATION_NUMBER:chararray,DATE_HEURE_APPEL:chararray);

STORE cdr INTO 'myindex/mytype' USING org.elasticsearch.hadoop.pig.PigRunner.run('es.nodes'='10.44.162.169');

When I execute this, the job is a success!

BUT nothing seems to appear in my ES!

  1. When I go and check in Marvel, I don't find any documents in myindex!

  2. Neither in my Kibana plugin!

  3. Furthermore, when I want to consult the logs in Hue, I can't find a thing!

    • Why isn't the data pushed into my ES?
    • What should I do to visualize it?
    • Why is the job reported as a success, yet there is no log to see what's happening?
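As a quick sanity check, independent of Marvel and Kibana, I can also query the index directly over HTTP to see whether any documents landed at all (assuming the index really is myindex/mytype):

# count the documents indexed into myindex/mytype
curl -XGET 'http://10.44.162.169:9200/myindex/mytype/_count?pretty'

# or peek at a few documents directly
curl -XGET 'http://10.44.162.169:9200/myindex/mytype/_search?size=5&pretty'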

I then tried it this way:

REGISTER /usr/elasticsearch-hadoop-2.0.2/dist/elasticsearch-hadoop-pig-2.0.2.jar

--load the CDR.csv file
cdr= LOAD '/user/admin/CDR_OMAR.csv' using PigStorage(';')
AS
(traffic_type_id:int,caller:int,call_time:datetime,tranche_horaire:int,called:int,called:int,call_duration:int,code_type:chararray,code_destination:chararray,location:chararray,id_offre:int,id_service:int,date_heure_appel:chararray);

--STORE cdr INTO 'indexOmar/typeOmar' USING EsStorage('es.nodes'='0.44.162.169:9200')
STORE cdr INTO 'telecom/cdr' USING org.elasticsearch.hadoop.pig.EsStorage('es.nodes'='10.44.162.169',
'es.mapping.names=call_time:@timestamp',
'es.index.auto.create = false');

But I got this error:

Run pig script using PigRunner.run() for Pig version 0.8+
2015-03-06 14:22:21,768 [main] INFO org.apache.pig.Main - Apache Pig version 0.12.0-cdh5.3.1 (rexported) compiled Jan 27 2015, 14:45:17
2015-03-06 14:22:21,770 [main] INFO org.apache.pig.Main - Logging error messages to: /yarn/nm/usercache/admin/appcache/application_1425457357655_0009/container_1425457357655_0009_01_000002/pig-job_1425457357655_0009.log
2015-03-06 14:22:21,863 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file /var/lib/hadoop-yarn/.pigbootup not found

Any idea why this is happening and how to fix it?

Now, the Hive issue:

I have downloaded the ES-Hadoop jar and added it to the path.

With that being said, I now want to load data from Hive into ES.

  1. First of all, I created a table from a CSV file under the table metastore (with Hue); a rough sketch of its DDL is included after the external table below.

  2. Then I defined an external table on top of ES in Hive, to write and load data into it later:

ADD JAR /usr/elasticsearch-hadoop-2.0.2/dist/elasticsearch-hadoop-hive-2.0.2.jar;

CREATE EXTERNAL TABLE es_cdr(
  id bigint,
  calling int,
  called int,
  duration int,
  location string,
  date string)
ROW FORMAT SERDE 'org.elasticsearch.hadoop.hive.EsSerDe'
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES(
  'es.nodes'='10.44.162.169',
  'es.resource' = 'indexOmar/typeOmar');

I've also manually added the SerDe snapshot JAR via Parameters => Add file => JAR.
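For reference, the source table hive_cdr from step 1 looks roughly like this (reconstructed from the EXPLAIN output further down, so the exact DDL may differ):

CREATE TABLE hive_cdr(
  traffic_type_id int,
  appelant int,
  called_number int,
  call_dur int,
  loc_number string,
  h_appel string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ';'
STORED AS TEXTFILE;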

Now I want to load data from my table into the new ES table:

INSERT OVERWRITE TABLE es_cdr
select NULL, h.appelant, h.called_number, h.call_duration, h.location_number, h.date_heure_appel from hive_cdr h;

But an error appears, saying:

Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask

And this is what's written in the log:

15/03/05 14:36:34 INFO log.PerfLogger: </PERFLOG method=semanticAnalyze start=1425562594381 end=1425562594463 duration=82 from=org.apache.hadoop.hive.ql.Driver>
15/03/05 14:36:34 INFO ql.Driver: Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:_col0, type:bigint, comment:null), FieldSchema(name:_col1, type:int, comment:null), FieldSchema(name:_col2, type:int, comment:null), FieldSchema(name:_col3, type:int, comment:null), FieldSchema(name:_col4, type:string, comment:null), FieldSchema(name:_col5, type:string, comment:null)], properties:null)
15/03/05 14:36:34 INFO ql.Driver: EXPLAIN output for queryid hive_20150305143636_528f97d4-b670-40e2-ba80-7d7a7bd441ff : ABSTRACT SYNTAX TREE:

TOK_QUERY
TOK_FROM
TOK_TABREF
TOK_TABNAME
hive_cdr
h
TOK_INSERT
TOK_DESTINATION
TOK_TAB
TOK_TABNAME
hive_es_cdr_10
TOK_SELECT
TOK_SELEXPR
TOK_NULL
TOK_SELEXPR
.
TOK_TABLE_OR_COL
h
appelant
TOK_SELEXPR
.
TOK_TABLE_OR_COL
h
called_number
TOK_SELEXPR
.
TOK_TABLE_OR_COL
h
call_dur
TOK_SELEXPR
.
TOK_TABLE_OR_COL
h
loc_number
TOK_SELEXPR
.
TOK_TABLE_OR_COL
h
h_appel
TOK_LIMIT
2

STAGE DEPENDENCIES:
Stage-0 is a root stage [MAPRED]

STAGE PLANS:
Stage: Stage-0
Map Reduce
Map Operator Tree:
TableScan
alias: h
GatherStats: false
Select Operator
expressions: null (type: string), appelant (type: int), called_number (type: int), call_dur (type: int), loc_number (type: string), h_appel (type: string)
outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
Limit
Number of rows: 2
Reduce Output Operator
sort order:
tag: -1
value expressions: _col0 (type: void), _col1 (type: int), _col2 (type: int), _col3 (type: int), _col4 (type: string), _col5 (type: string)
Path -> Alias:
hdfs://master:8020/user/hive/warehouse/hive_cdr [h]
Path -> Partition:
hdfs://master:8020/user/hive/warehouse/hive_cdr
Partition
base file name: hive_cdr
input format: org.apache.hadoop.mapred.TextInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
properties:
COLUMN_STATS_ACCURATE true
bucket_count -1
columns traffic_type_id,appelant,called_number,call_dur,loc_number,h_appel
columns.comments
columns.types int:int:int:int:string:string
field.delim ;
file.inputformat org.apache.hadoop.mapred.TextInputFormat
file.outputformat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
location hdfs://master:8020/user/hive/warehouse/hive_cdr
name default.hive_cdr
numFiles 1
numRows 0
rawDataSize 0
serialization.ddl struct hive_cdr { i32 traffic_type_id, i32 appelant, i32 called_number, i32 call_dur, string loc_number, string h_appel}
serialization.format ;
serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
totalSize 56373362
transient_lastDdlTime 1425459002
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

          input format: org.apache.hadoop.mapred.TextInputFormat
          output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
          properties:
            COLUMN_STATS_ACCURATE true
            bucket_count -1
            columns traffic_type_id,appelant,called_number,call_dur,loc_number,h_appel
            columns.comments 
            columns.types int:int:int:int:string:string
            field.delim ;
            file.inputformat org.apache.hadoop.mapred.TextInputFormat
            file.outputformat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
            location hdfs://master:8020/user/hive/warehouse/hive_cdr
            name default.hive_cdr
            numFiles 1
            numRows 0
            rawDataSize 0
            serialization.ddl struct hive_cdr { i32 traffic_type_id, i32 appelant, i32 called_number, i32 call_dur, string loc_number, string h_appel}
            serialization.format ;
            serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
            totalSize 56373362
            transient_lastDdlTime 1425459002
          serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
          name: default.hive_cdr
        name: default.hive_cdr
  Truncated Path -> Alias:
    /hive_cdr [h]
  Needs Tagging: false
  Reduce Operator Tree:
    Extract
      Limit
        Number of rows: 2
        Select Operator
          expressions: UDFToLong(_col0) (type: bigint), _col1 (type: int), _col2 (type: int), _col3 (type: int), _col4 (type: string), _col5 (type: string)
          outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
          File Output Operator
            compressed: false
            GlobalTableId: 1
            directory: hdfs://master:8020/user/hive/warehouse/hive_es_cdr_10
            NumFilesPerFileSink: 1
            Stats Publishing Key Prefix: hdfs://master:8020/user/hive/warehouse/hive_es_cdr_10/
            table:
                input format: org.elasticsearch.hadoop.hive.EsHiveInputFormat
                jobProperties:
                  EXTERNAL TRUE
                  bucket_count -1
                  columns id_traffic,caller,called,call_dur,caller_location,call_date
                  columns.comments 
                  columns.types bigint:int:int:int:string:string
                  es.nodes 10.44.162.169
                  es.port 9200
                  es.resource myindex/mytype
                  file.inputformat org.apache.hadoop.mapred.SequenceFileInputFormat
                  file.outputformat org.apache.hadoop.mapred.SequenceFileOutputFormat
                  location hdfs://master:8020/user/hive/warehouse/hive_es_cdr_10
                  name default.hive_es_cdr_10
                  serialization.ddl struct hive_es_cdr_10 { i64 id_traffic, i32 caller, i32 called, i32 call_dur, string caller_location, string call_date}
                  serialization.format 1
                  serialization.lib org.elasticsearch.hadoop.hive.EsSerDe
                  storage_handler org.elasticsearch.hadoop.hive.EsStorageHandler
                  transient_lastDdlTime 1425561441
                output format: org.elasticsearch.hadoop.hive.EsHiveOutputFormat
                properties:
                  EXTERNAL TRUE
                  bucket_count -1
                  columns id_traffic,caller,called,call_dur,caller_location,call_date
                  columns.comments 
                  columns.types bigint:int:int:int:string:string
                  es.nodes 10.44.162.169
                  es.port 9200
                  es.resource myindex/mytype
                  file.inputformat org.apache.hadoop.mapred.SequenceFileInputFormat
                  file.outputformat org.apache.hadoop.mapred.SequenceFileOutputFormat
                  location hdfs://master:8020/user/hive/warehouse/hive_es_cdr_10
                  name default.hive_es_cdr_10
                  serialization.ddl struct hive_es_cdr_10 { i64 id_traffic, i32 caller, i32 called, i32 call_dur, string caller_location, string call_date}
                  serialization.format 1
                  serialization.lib org.elasticsearch.hadoop.hive.EsSerDe
                  storage_handler org.elasticsearch.hadoop.hive.EsStorageHandler
                  transient_lastDdlTime 1425561441
                serde: org.elasticsearch.hadoop.hive.EsSerDe
                name: default.hive_es_cdr_10
            TotalFiles: 1
            GatherStats: false
            MultiFileSpray: false

15/03/05 14:36:34 INFO log.PerfLogger: </PERFLOG method=compile start=1425562594378 end=1425562594484 duration=106 from=org.apache.hadoop.hive.ql.Driver>
15/03/05 14:36:34 INFO log.PerfLogger:
15/03/05 14:36:34 INFO log.PerfLogger:
15/03/05 14:36:34 INFO log.PerfLogger:
15/03/05 14:36:34 INFO lockmgr.DummyTxnManager: Creating lock manager of type org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager
15/03/05 14:36:34 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=master:2181 sessionTimeout=600000 watcher=org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager$DummyWatcher@70e69669
15/03/05 14:36:34 INFO log.PerfLogger: </PERFLOG method=acquireReadWriteLocks start=1425562594502 end=1425562594523 duration=21 from=org.apache.hadoop.hive.ql.Driver>
15/03/05 14:36:34 INFO log.PerfLogger:
15/03/05 14:36:34 INFO ql.Driver: Starting command: INSERT OVERWRITE TABLE hive_es_cdr_10
SELECT NULL,h.appelant,h.called_number,h.call_dur,h.loc_number,h.h_appel FROM hive_cdr h limit 2
15/03/05 14:36:34 INFO ql.Driver: Total jobs = 1
15/03/05 14:36:34 INFO log.PerfLogger: </PERFLOG method=TimeToSubmit start=1425562594500 end=1425562594526 duration=26 from=org.apache.hadoop.hive.ql.Driver>
15/03/05 14:36:34 INFO log.PerfLogger:
15/03/05 14:36:34 INFO log.PerfLogger:
15/03/05 14:36:34 INFO ql.Driver: Launching Job 1 out of 1
15/03/05 14:36:34 INFO exec.Task: Number of reduce tasks determined at compile time: 1
15/03/05 14:36:34 INFO exec.Task: In order to change the average load for a reducer (in bytes):
15/03/05 14:36:34 INFO exec.Task: set hive.exec.reducers.bytes.per.reducer=
15/03/05 14:36:34 INFO exec.Task: In order to limit the maximum number of reducers:
15/03/05 14:36:34 INFO exec.Task: set hive.exec.reducers.max=
15/03/05 14:36:34 INFO exec.Task: In order to set a constant number of reducers:
15/03/05 14:36:34 INFO exec.Task: set mapreduce.job.reduces=
15/03/05 14:36:34 INFO ql.Context: New scratch dir is hdfs://master:8020/tmp/hive-hive/hive_2015-03-05_14-36-34_378_4527939627221909415-7
15/03/05 14:36:34 INFO mr.ExecDriver: Using org.apache.hadoop.hive.ql.io.CombineHiveInputFormat
15/03/05 14:36:34 INFO mr.ExecDriver: adding libjars: file:///tmp/d39b23a8-98d2-4bc3-9008-3eff080dd20c_resources/hive-serdes-1.0-SNAPSHOT.jar,file:///usr/elasticsearch-hadoop-2.0.2/dist/elasticsearch-hadoop-hive-2.0.2.jar,file:///opt/cloudera/parcels/CDH-5.3.1-1.cdh5.3.1.p0.5/lib/hive/lib/hive-hbase-handler-0.13.1-cdh5.3.1.jar,file:///opt/cloudera/parcels/CDH-5.3.1-1.cdh5.3.1.p0.5/lib/hbase/hbase-server.jar,file:///opt/cloudera/parcels/CDH-5.3.1-1.cdh5.3.1.p0.5/lib/hbase/lib/htrace-core.jar,file:///opt/cloudera/parcels/CDH-5.3.1-1.cdh5.3.1.p0.5/lib/hbase/lib/htrace-core-2.04.jar,file:///opt/cloudera/parcels/CDH-5.3.1-1.cdh5.3.1.p0.5/lib/hbase/hbase-common.jar,file:///opt/cloudera/parcels/CDH-5.3.1-1.cdh5.3.1.p0.5/lib/hbase/hbase-client.jar,file:///opt/cloudera/parcels/CDH-5.3.1-1.cdh5.3.1.p0.5/lib/hbase/hbase-protocol.jar,file:///opt/cloudera/parcels/CDH-5.3.1-1.cdh5.3.1.p0.5/lib/hbase/hbase-hadoop2-compat.jar,file:///opt/cloudera/parcels/CDH-5.3.1-1.cdh5.3.1.p0.5/lib/hbase/hbase-hadoop-compat.jar
15/03/05 14:36:34 INFO exec.Utilities: Processing alias h
15/03/05 14:36:34 INFO exec.Utilities: Adding input file hdfs://master:8020/user/hive/warehouse/hive_cdr
15/03/05 14:36:34 INFO exec.Utilities: Content Summary not cached for hdfs://master:8020/user/hive/warehouse/hive_cdr
15/03/05 14:36:34 INFO ql.Context: New scratch dir is hdfs://master:8020/tmp/hive-hive/hive_2015-03-05_14-36-34_378_4527939627221909415-7
15/03/05 14:36:34 INFO log.PerfLogger:
15/03/05 14:36:34 INFO exec.Utilities: Serializing MapWork via kryo
15/03/05 14:36:34 INFO log.PerfLogger: </PERFLOG method=serializePlan start=1425562594554 end=1425562594638 duration=84 from=org.apache.hadoop.hive.ql.exec.Utilities>
15/03/05 14:36:34 INFO log.PerfLogger:
15/03/05 14:36:34 INFO exec.Utilities: Serializing ReduceWork via kryo
15/03/05 14:36:34 INFO log.PerfLogger: </PERFLOG method=serializePlan start=1425562594653 end=1425562594708 duration=55 from=org.apache.hadoop.hive.ql.exec.Utilities>
15/03/05 14:36:34 INFO client.RMProxy: Connecting to ResourceManager at master/10.44.162.169:8032
15/03/05 14:36:34 INFO client.RMProxy: Connecting to ResourceManager at master/10.44.162.169:8032
15/03/05 14:36:34 WARN mr.EsOutputFormat: Speculative execution enabled for reducer - consider disabling it to prevent data corruption
15/03/05 14:36:34 INFO mr.EsOutputFormat: Writing to [myindex/mytype]
15/03/05 14:36:34 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
15/03/05 14:36:35 INFO log.PerfLogger:
15/03/05 14:36:35 INFO io.CombineHiveInputFormat: CombineHiveInputSplit creating pool for hdfs://master:8020/user/hive/warehouse/hive_cdr; using filter path hdfs://master:8020/user/hive/warehouse/hive_cdr
15/03/05 14:36:35 INFO input.FileInputFormat: Total input paths to process : 1
15/03/05 14:36:35 INFO input.CombineFileInputFormat: DEBUG: Terminated node allocation with : CompletedNodes: 3, size left: 0
15/03/05 14:36:35 INFO io.CombineHiveInputFormat: number of splits 1
15/03/05 14:36:35 INFO log.PerfLogger: </PERFLOG method=getSplits start=1425562595867 end=1425562595896 duration=29 from=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat>
15/03/05 14:36:35 INFO mapreduce.JobSubmitter: number of splits:1
15/03/05 14:36:36 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1425457357655_0006
15/03/05 14:36:36 INFO impl.YarnClientImpl: Submitted application application_1425457357655_0006
15/03/05 14:36:36 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1425457357655_0006/
15/03/05 14:36:36 INFO exec.Task: Starting Job = job_1425457357655_0006, Tracking URL = http://master:8088/proxy/application_1425457357655_0006/
15/03/05 14:36:36 INFO exec.Task: Kill Command = /opt/cloudera/parcels/CDH-5.3.1-1.cdh5.3.1.p0.5/lib/hadoop/bin/hadoop job -kill job_1425457357655_0006
15/03/05 14:36:58 INFO exec.Task: Hadoop job information for Stage-0: number of mappers: 0; number of reducers: 0
15/03/05 14:36:58 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
15/03/05 14:36:58 INFO exec.Task: 2015-03-05 14:36:58,687 Stage-0 map = 0%, reduce = 0%
15/03/05 14:36:58 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
15/03/05 14:36:58 ERROR exec.Task: Ended Job = job_1425457357655_0006 with errors
15/03/05 14:36:58 INFO impl.YarnClientImpl: Killed application application_1425457357655_0006
15/03/05 14:36:58 ERROR ql.Driver: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
15/03/05 14:36:58 INFO log.PerfLogger: </PERFLOG method=Driver.execute start=1425562594523 end=1425562618754 duration=24231 from=org.apache.hadoop.hive.ql.Driver>
15/03/05 14:36:58 INFO ql.Driver: MapReduce Jobs Launched:
15/03/05 14:36:58 WARN mapreduce.Counters: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
15/03/05 14:36:58 INFO ql.Driver: Stage-Stage-0: HDFS Read: 0 HDFS Write: 0 FAIL
15/03/05 14:36:58 INFO ql.Driver: Total MapReduce CPU Time Spent: 0 msec
15/03/05 14:36:58 INFO log.PerfLogger:
15/03/05 14:36:58 INFO ZooKeeperHiveLockManager: about to release lock for default/hive_es_cdr_10
15/03/05 14:36:58 INFO ZooKeeperHiveLockManager: about to release lock for default/hive_cdr
15/03/05 14:36:58 INFO ZooKeeperHiveLockManager: about to release lock for default
15/03/05 14:36:58 INFO log.PerfLogger: </PERFLOG method=releaseLocks start=1425562618768 end=1425562618780 duration=12 from=org.apache.hadoop.hive.ql.Driver>
15/03/05 14:36:58 ERROR operation.Operation: Error running hive query:
org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:147)
at org.apache.hive.service.cli.operation.SQLOperation.access$000(SQLOperation.java:69)
at org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:200)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:502)
at org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:213)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)


I am also trying to load data from Hadoop to Elasticsearch and am having some issues as well, though I am facing a different issue than yours.

In my case, the Hadoop cluster and edge node run on different machines than my Elasticsearch server. First I confirmed that network communication is working between the Hadoop edge node and the Elasticsearch server. I have uploaded elasticsearch-hadoop-2.0.2.jar into my user directory on the edge node. I also have a Hive table present (shown below) that I want to upload into the Elasticsearch server running on a different machine.
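That connectivity check was nothing elaborate; roughly, just confirming from the edge node that the Elasticsearch HTTP port answers, along these lines (the exact command I used may have differed):

# hit the ES HTTP port from the edge node (IP as used in the table properties below)
curl -XGET 'http://64.102.212.139:9200'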

hive> show tables dw_priority;
OK
tab_name
Time taken: 0.064 seconds
hive>

Then I added elasticsearch-hadoop-2.0.2.jar to the Hive path, as below:
hive> ADD Jars ./elasticsearch-hadoop-2.0.2.jar;
Added ./elasticsearch-hadoop-2.0.2.jar to class path
Added resource: ./elasticsearch-hadoop-2.0.2.jar
hive>

After that, when I try to create an external table in Hive, I get a bunch of errors.
Here is my external table definition, followed by the errors.
hive> CREATE EXTERNAL TABLE remedytst_es ( dw_incident_id double, incident_number string)
> ROW FORMAT SERDE 'org.elasticsearch.hadoop.hive.EsSerDe'
> STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
> TBLPROPERTIES (
> 'es.nodes’=’64.102.212.139’ ,
> 'es.resource’=‘itsm/incident’) ;
NoViableAltException(286@[])
at org.apache.hadoop.hive.ql.parse.HiveParser.tablePropertiesList(HiveParser.java:23403)
at org.apache.hadoop.hive.ql.parse.HiveParser.tableProperties(HiveParser.java:23292)
at org.apache.hadoop.hive.ql.parse.HiveParser.tablePropertiesPrefixed(HiveParser.java:23227)
at org.apache.hadoop.hive.ql.parse.HiveParser.createTableStatement(HiveParser.java:4460)
at org.apache.hadoop.hive.ql.parse.HiveParser.ddlStatement(HiveParser.java:2016)
at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1298)
at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:938)
at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:190)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:342)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:977)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:888)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:781)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
FAILED: ParseException line 5:0 cannot recognize input near ''es.nodes’=’64.102.212.139’ ,\n'' 'es' '.' in table properties list

Looking at the errors, I think Hive is not able to recognize the elasticsearch-hadoop-2.0.2.jar.

Did you also face a similar issue?