Failed to run Hive job with CDH 5.1.2 and ES-Hadoop 2.0.0

Hello list,

I have a 4-node ES cluster and a 6-node CDH cluster running in the lab.

The Hive job is below:

========hive job===============
CREATE TABLE logs (type STRING, time STRING, ext STRING, ip STRING, req
STRING, res INT, bytes INT, phpmem INT, agent STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
LOAD DATA INPATH '/user/hue/apache/apache.log' OVERWRITE INTO TABLE logs;

CREATE EXTERNAL TABLE eslogs (time STRING, extension STRING, clientip
STRING, request STRING, response INT, agent STRING)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.resource' = 'logstash-2014.08.06/hive',
'es.mapping.names' = 'time:@timestamp',
'es.nodes' = 'es');

INSERT OVERWRITE TABLE eslogs SELECT s.time, s.ext, s.ip, s.req, s.res,
s.agent FROM logs s;

The error message is below:

2014-09-16 15:31:20,127 INFO [AsyncDispatcher event handler]
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics
report from attempt_1410833918846_0007_m_000000_3: Error:
java.lang.RuntimeException:
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while
processing row
{"type":"apache","time":"2013-10-09T14:04:32Z","ext":"php","ip":"129.124.201.110","req":"/EKEE.php","res":200,"bytes":1970,"phpmem":5910,"agent":"Mozilla/5.0
(X11; Linux i686) AppleWebKit/534.24 (KHTML, like Gecko) Chrome/11.0.696.50
Safari/534.24"}
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:175)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime
Error while processing row
{"type":"apache","time":"2013-10-09T14:04:32Z","ext":"php","ip":"129.124.201.110","req":"/EKEE.php","res":200,"bytes":1970,"phpmem":5910,"agent":"Mozilla/5.0
(X11; Linux i686) AppleWebKit/534.24 (KHTML, like Gecko) Chrome/11.0.696.50
Safari/534.24"}
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:529)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:157)
... 8 more
Caused by: java.lang.NullPointerException
at org.elasticsearch.hadoop.serialization.dto.Node.<init>(Node.java:33)
at org.elasticsearch.hadoop.rest.RestClient.getNodes(RestClient.java:245)
at org.elasticsearch.hadoop.rest.RestRepository.getWriteTargetPrimaryShards(RestRepository.java:240)
at org.elasticsearch.hadoop.mr.EsOutputFormat$EsRecordWriter.initSingleIndex(EsOutputFormat.java:218)
at org.elasticsearch.hadoop.mr.EsOutputFormat$EsRecordWriter.init(EsOutputFormat.java:201)
at org.elasticsearch.hadoop.hive.EsHiveOutputFormat$EsHiveRecordWriter.write(EsHiveOutputFormat.java:58)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:638)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:847)
at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:87)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:847)
at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:91)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:847)
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:519)
... 9 more

2014-09-16 15:31:20,129 INFO [AsyncDispatcher event handler]
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl:
attempt_1410833918846_0007_m_000000_3 TaskAttempt Transitioned from RUNNING
to FAIL_CONTAINER_CLEANUP
2014-09-16 15:31:20,130 INFO [ContainerLauncher #7]
org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl:
Processing the event EventType: CONTAINER_REMOTE_CLEANUP for container
container_1410833918846_0007_01_000005 taskAttempt
attempt_1410833918846_0007_m_000000_3

I would appreciate it if anyone could give me a clue.

--
jOe

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CA%2BXGj5A-6QzqdoDytQ_E7g%3Dtgag39i%3DheMjF%3D8RfheDtuQfY_w%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Hi,

Upgrade to es-hadoop 2.0.1.
The error occurs because some nodes in your ES cluster do not expose an HTTP/REST endpoint. In 2.0.1 such nodes are
properly excluded; note, however, that this means es-hadoop will not use them.
As an alternative, consider enabling HTTP on all your data nodes.
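
To find out which nodes are affected, one option is to inspect the node info the cluster returns (for example the body of a `GET /_nodes/http` request). The sketch below is only an illustration: it assumes that a node with HTTP disabled (typically via `http.enabled: false` in elasticsearch.yml) shows up without an `http_address` field, and the sample response, node names and the `nodes_without_http` helper are made up for the example.

```python
import json


def nodes_without_http(nodes_info):
    """Return the names of nodes whose info entry carries no HTTP address.

    Assumption: a node with HTTP disabled has no "http_address" field.
    """
    missing = []
    for node_id, info in nodes_info.get("nodes", {}).items():
        if not info.get("http_address"):
            # Fall back to the node id when the entry has no name.
            missing.append(info.get("name", node_id))
    return sorted(missing)


# Hypothetical, trimmed-down response body; real responses carry more fields.
sample = json.loads("""
{
  "cluster_name": "lab",
  "nodes": {
    "a1": {"name": "es-data-1", "http_address": "inet[/10.0.0.1:9200]"},
    "b2": {"name": "es-data-2"},
    "c3": {"name": "es-master-1", "http_address": "inet[/10.0.0.3:9200]"}
  }
}
""")

print(nodes_without_http(sample))
```

Any node listed by such a check is one that es-hadoop 2.0.1 would skip, so if it holds data you may prefer to enable HTTP on it instead.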

--
Costin

Thanks a lot! Could you tell me where to find this bug and the description
of the fix?

On GitHub under issues [1], or in the release notes for the 2.0.1 release. Most likely, you are facing issue #210.

[1] Issues · elastic/elasticsearch-hadoop · GitHub

--
Costin
