ES 2.0.0 + ES-Hadoop 2.1.1 | 2.2.0-m1 usecase & issues


#1

Hello,

0/ HEAD vs HQ
I still have the same problem with the HEAD plugin (Connect button -> error "_status"). Anyway, I now use HQ and everything is fine.

1/ node.master, node.data vs node.name
In the previous release (1.7.X) I used the node.master and node.data properties to tune each node.
In 2.0.0 I can only find a node.name property. HQ shows the nodes (Screenshot 04):
These settings are very important for tuning a multi-node architecture.
How are data nodes configured in the yml file?

2/ hadoop + pig + elastic
I use elasticsearch-hadoop.
Previous architecture: pig-0.15.0, ES 1.7.2 (2 nodes), elasticsearch-hadoop-2.1.1, Kibana 4.1.2.

...
REGISTER /jbigdata/elasticsearch-hadoop-2.1.1/dist/elasticsearch-hadoop-pig-2.1.1.jar;
-- REGISTER /jbigdata/elasticsearch-hadoop-2.2.0-m1/dist/elasticsearch-hadoop-pig-2.2.0-m1.jar;
...

Everything worked fine: Pig Latin script, ES index creation, Kibana dashboards.

Current architecture: pig-0.15.0, ES 2.0.0 (2 nodes), elasticsearch-hadoop-2.1.1 (same issue with elasticsearch-hadoop-pig-2.2.0-m1), Kibana 4.X (not yet used).

When I use org.elasticsearch.hadoop.pig.EsStorage(...), which works fine with ES 1.7.2, the job fails.
The Hadoop JobHistory (Screenshot 03) shows the issue:
with the same Pig script, 2 jobs succeed against ES 1.7.2 and 2 jobs fail against ES 2.0.0. Stack trace:

2015-10-29 09:20:15,296 INFO [IPC Server handler 0 on 48708] org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID: jvm_1445412791627_0003_m_000003 given task: attempt_1445412791627_0003_m_000000_0
2015-10-29 09:20:18,757 INFO [IPC Server handler 1 on 48708] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1445412791627_0003_m_000000_0 is : 0.0
2015-10-29 09:20:18,778 FATAL [IPC Server handler 12 on 48708] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1445412791627_0003_m_000000_0 - exited : java.lang.StringIndexOutOfBoundsException: String index out of range: -29
	at java.lang.String.substring(String.java:1911)
	at org.elasticsearch.hadoop.rest.RestClient.discoverNodes(RestClient.java:110)
	at org.elasticsearch.hadoop.rest.InitializationUtils.discoverNodesIfNeeded(InitializationUtils.java:58)
	at org.elasticsearch.hadoop.rest.RestService.createWriter(RestService.java:374)
	at org.elasticsearch.hadoop.mr.EsOutputFormat$EsRecordWriter.init(EsOutputFormat.java:173)
	at org.elasticsearch.hadoop.mr.EsOutputFormat$EsRecordWriter.write(EsOutputFormat.java:149)
	at org.elasticsearch.hadoop.pig.EsStorage.putNext(EsStorage.java:192)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:136)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:95)
	at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:655)
	at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
	at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.collect(PigMapOnly.java:48)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:281)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:274)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

Any idea?


(Mark Walkom) #2

It's probably easier if you create a new thread for your Hadoop issues in the Hadoop forum.


(Mark Walkom) #3

Not sure why you think node.master and node.data are gone; they are still documented here: https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-node.html?q=node.master
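
As a minimal sketch of what that page describes for 2.0, the two settings go into elasticsearch.yml next to node.name, just as in 1.7.X. The node name below is hypothetical, and the values only illustrate a dedicated data node; both settings default to true when omitted.

```yaml
# elasticsearch.yml (illustrative sketch; the node name is made up)
node.name: data-node-1

# Not eligible to be elected master
node.master: false

# Holds shards and serves indexing/search requests
node.data: true
```

For a dedicated master-eligible node the values would be swapped (node.master: true, node.data: false).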


#4

OK, I'll keep investigating.
This open-source software is cool.


(system) #5