[elasticsearch-hadoop problem] Hive join of two tables fails with an error; both tables are stored in Elasticsearch

Dear all:

    Recently I came across a strange problem. I want to use
elasticsearch-1.0.0 as backend storage for Hive. I use
elasticsearch-hadoop-1.3.0.M2 to create Hive tables on Elasticsearch. The
Hive statements are as follows:

create external table supplier_es (
    S_SUPPKEY BIGINT, S_NAME STRING, S_ADDRESS STRING, S_NATIONKEY BIGINT,
    S_PHONE STRING, S_ACCTBAL DOUBLE, S_COMMENT STRING)
stored by 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.resource' = 'q9/supplier', 'es.index.auto.create' = 'true', 'es.nodes' = 'localhost:9200');

create external table nation_es (
    N_NATIONKEY BIGINT, N_NAME STRING, N_REGIONKEY BIGINT, N_COMMENT STRING)
stored by 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.resource' = 'q9/nation', 'es.index.auto.create' = 'true', 'es.nodes' = 'localhost:9200');

The join query is as follows:

select s_suppkey, n_name from supplier_es s join nation_es n on
n.n_nationkey = s.s_nationkey;

The error messages (taken from the log file):

2014-03-19 14:21:03,254 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: map.input.file is deprecated. Instead, use mapreduce.map.input.file
2014-03-19 14:21:03,254 INFO [main] org.apache.hadoop.hive.ql.exec.MapOperator: fpath:hdfs://server-220.novalocal:8020/user/hive/warehouse/nation_es
2014-03-19 14:21:03,268 INFO [main] org.apache.hadoop.hive.ql.exec.MapOperator: getPathToAliases
2014-03-19 14:21:03,268 INFO [main] org.apache.hadoop.hive.ql.exec.MapOperator: key : hdfs://server-220.novalocal:8020/user/hive/warehouse/supplier_es
2014-03-19 14:21:03,268 INFO [main] org.apache.hadoop.hive.ql.exec.MapOperator: value[s]
2014-03-19 14:21:03,268 INFO [main] org.apache.hadoop.hive.ql.exec.MapOperator: Adding alias s to work list for file hdfs://server-220.novalocal:8020/user/hive/warehouse/supplier_es
2014-03-19 14:21:03,270 ERROR [main] org.apache.hadoop.hive.ql.exec.MapOperator: Configuration does not have any alias for path: hdfs://server-220.novalocal:8020/user/hive/warehouse/nation_es
2014-03-19 14:21:03,285 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.RuntimeException: Error in configuring object
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
at org.apache.hadoop.mapred.MapTask.runOldMapper_aroundBody2(MapTask.java:434)
at org.apache.hadoop.mapred.MapTask$AjcClosure3.run(MapTask.java:1)
at org.aspectj.runtime.reflect.JoinPointImpl.proceed(JoinPointImpl.java:149)
at com.intel.bigdata.management.agent.HadoopTaskAspect.doPhaseCall(HadoopTaskAspect.java:166)
at com.intel.bigdata.management.agent.HadoopTaskAspect.ajc$inlineAccessMethod$com_intel_bigdata_management_agent_HadoopTaskAspect$com_intel_bigdata_management_agent_HadoopTaskAspect$doPhaseCall(HadoopTaskAspect.java:1)
at com.intel.bigdata.management.agent.HadoopTaskAspect.aroundMap(HadoopTaskAspect.java:38)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:411)
at org.apache.hadoop.mapred.MapTask.run_aroundBody0(MapTask.java:343)
at org.apache.hadoop.mapred.MapTask$AjcClosure1.run(MapTask.java:1)
at org.aspectj.runtime.reflect.JoinPointImpl.proceed(JoinPointImpl.java:149)
at com.intel.bigdata.management.agent.HadoopTaskAspect.aroundTaskRun(HadoopTaskAspect.java:95)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1531)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
... 19 more
Caused by: java.lang.RuntimeException: Error in configuring object
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
... 24 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
... 27 more
Caused by: java.lang.RuntimeException: Map operator initialization failed
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:142)
... 32 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: Configuration and input path are inconsistent
at org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:419)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:110)
... 32 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Configuration and input path are inconsistent
at org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:413)
... 33 more

I have tried to figure out the problem, but I cannot find the cause. Any
help would be appreciated. Thanks very much.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/4bcaef49-7483-4777-a8a2-6af671a082af%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Hi,

The issue might be caused by the fact that M2 doesn't support different input and output indices within the same job,
that is, using ES as both input and output in one job (which is essentially what you are doing with the select).
This has been fixed in master. Can you try the latest nightly build, or build master yourself?
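
For reference, here is a rough Python sketch of what the log lines in the report seem to show. It is only a simplified model of the alias lookup, not Hive's actual MapOperator code: the function and variable names are made up for illustration, and the "needed" mapping is an assumption about what a two-table join would require. The task was handed an input path for nation_es, while the configuration it received only registered supplier_es (alias s).

```python
# Simplified model of the alias lookup suggested by the log.
# NOT Hive's real implementation; all names here are hypothetical.
WAREHOUSE = "hdfs://server-220.novalocal:8020/user/hive/warehouse"


def aliases_for_path(path_to_aliases, input_path):
    """Return the table aliases registered for the current input path."""
    aliases = path_to_aliases.get(input_path)
    if not aliases:
        # Corresponds to the ERROR line in the log for an unregistered path.
        raise RuntimeError(
            "Configuration does not have any alias for path: " + input_path
        )
    return aliases


def try_lookup(path_to_aliases, input_path):
    """Run the lookup and return either the aliases or the error text."""
    try:
        return aliases_for_path(path_to_aliases, input_path)
    except RuntimeError as err:
        return str(err)


# What the failing task logged: only supplier_es (alias s) was registered.
path_to_aliases_seen = {WAREHOUSE + "/supplier_es": ["s"]}

# Assumption: a working join would register both table paths.
path_to_aliases_needed = {
    WAREHOUSE + "/supplier_es": ["s"],
    WAREHOUSE + "/nation_es": ["n"],
}

if __name__ == "__main__":
    nation_path = WAREHOUSE + "/nation_es"
    print(try_lookup(path_to_aliases_seen, nation_path))
    print(try_lookup(path_to_aliases_needed, nation_path))
```

With the mapping from the log, the lookup for nation_es fails; with both paths registered it returns the alias n. Whether M2 collapsing the two tables into one configuration is really the cause still needs to be confirmed by testing a newer build.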

Cheers,

On 3/19/14 9:01 AM, 沈国权 wrote:

--
Costin
