select count(1) from <table> fails in Hive


(Mohankumar K H) #1

OS : CentOS 6.5
JVM : java version "1.7.0_67"

Hadoop/Spark: CDH 5.5.x hive-common-1.1.0-cdh5.5.1.jar
ES-Hadoop : elasticsearch-hadoop-2.3.2.jar
ES : elasticsearch-2.3.3-1.noarch

A "select * from" query on the same table works fine.

The query below fails:

select count(1) from msbilogs;

0: jdbc:hive2://hdfc02gw01.amr.corp.intel.com> select count(1) from msbilogs;
INFO : Number of reduce tasks determined at compile time: 1
INFO : In order to change the average load for a reducer (in bytes):
INFO : set hive.exec.reducers.bytes.per.reducer=<number>
INFO : In order to limit the maximum number of reducers:
INFO : set hive.exec.reducers.max=<number>
INFO : In order to set a constant number of reducers:
INFO : set mapreduce.job.reduces=<number>
INFO : Cleaning up the staging area /user/hive/.staging/job_1465210153149_0065
ERROR : Job Submission failed with exception 'java.lang.NullPointerException(null)'
java.lang.NullPointerException
at java.io.DataOutputStream.writeUTF(DataOutputStream.java:347)
at java.io.DataOutputStream.writeUTF(DataOutputStream.java:323)
at org.elasticsearch.hadoop.mr.EsInputFormat$ShardInputSplit.write(EsInputFormat.java:108)
at org.elasticsearch.hadoop.hive.EsHiveInputFormat$EsHiveSplit.write(EsHiveInputFormat.java:77)
at org.apache.hadoop.hive.ql.io.HiveInputFormat$HiveInputSplit.write(HiveInputFormat.java:177)
at org.apache.hadoop.mapreduce.split.JobSplitWriter.writeOldSplits(JobSplitWriter.java:164)
at org.apache.hadoop.mapreduce.split.JobSplitWriter.createSplitFiles(JobSplitWriter.java:92)
at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:353)
at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:323)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:199)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1307)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1304)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1304)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:564)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:559)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:559)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:550)
at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:429)
at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:137)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1636)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1396)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1181)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1047)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1042)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:144)
at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:69)
at org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:196)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:208)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)

Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=1)


(Jimmy Kuang) #2

I'm using the latest ES-Hadoop connector and the latest ES, and this query works for me:

hive> select count(1) from test7;
Query ID = hive_20170331032150_b3e14c51-bb98-4b05-bce9-93c978cd02d9
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1490918310964_0004, Tracking URL = http://sandbox.hortonworks.com:8088/proxy/application_1490918310964_0004/
Kill Command = /usr/hdp/2.5.0.0-1245/hadoop/bin/hadoop job  -kill job_1490918310964_0004
Hadoop job information for Stage-1: number of mappers: 5; number of reducers: 1
2017-03-31 03:22:02,559 Stage-1 map = 0%,  reduce = 0%
2017-03-31 03:22:28,636 Stage-1 map = 20%,  reduce = 0%, Cumulative CPU 2.6 sec
2017-03-31 03:22:32,729 Stage-1 map = 40%,  reduce = 0%, Cumulative CPU 4.66 sec
2017-03-31 03:22:33,947 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 12.7 sec
2017-03-31 03:22:38,146 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 14.37 sec
MapReduce Total cumulative CPU time: 14 seconds 370 msec
Ended Job = job_1490918310964_0004
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 5  Reduce: 1   Cumulative CPU: 14.37 sec   HDFS Read: 423094 HDFS Write: 2 SUCCESS
Total MapReduce CPU Time Spent: 14 seconds 370 msec
OK
2
Time taken: 50.041 seconds, Fetched: 1 row(s)

Can you run "desc formatted msbilogs"?
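
For context on the trace in post #1: the top frames show the NullPointerException raised inside java.io.DataOutputStream.writeUTF, called from EsInputFormat$ShardInputSplit.write. The JDK's writeUTF throws NullPointerException when it is handed a null String, so one plausible reading is that some string field of the split was null when Hive serialized it. The trace does not show which field, and that is a guess rather than a confirmed diagnosis. A minimal sketch of just the JDK behavior (the class and method names below are hypothetical and not part of ES-Hadoop):

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical demo class: shows only how DataOutputStream.writeUTF reacts to null.
public class WriteUtfNullDemo {

    // Returns true if writeUTF(value) throws NullPointerException, false otherwise.
    static boolean throwsNpe(String value) {
        DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream());
        try {
            out.writeUTF(value);
            return false;
        } catch (NullPointerException e) {
            // Same exception type as the top frame of the Hive trace.
            return true;
        } catch (IOException e) {
            // Not expected for an in-memory stream; surface it if it happens.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("null value throws NPE: " + throwsNpe(null));
        System.out.println("non-null value throws NPE: " + throwsNpe("node-1"));
    }
}
```

If that reading is right, the table definition and the connector settings are the first things to compare, which is why the "desc formatted" output would help.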

