Writing to dynamic/multi-resources not working with Pig and ES-Hadoop 2.2


#1

Hello,

I'm evaluating ES-Hadoop to integrate it into our pipeline. I'm using ES 2.1, Pig 0.14 and version 2.2.0-rc1 of ES-Hadoop.
I need to write to multiple indexes/types at the same time.

I'm reading a file containing the following line:

{"partitionId":15,"siteId":"br2ryqd66","visitorId":"0001525cf5423a334e3df","visitId":"00015bbf7a52c4cbba536","eventId":"eawe38cukbpqfmuaursoszqjnly819fs","ts":"2016-01-07T23:54:24.824Z","eventType":"visit","eventName":"visit_closed","eventLive":1,"visit":{},"partner":{},"visit_closed":{},"meta":{"type":"event", "index":"v00000262"}}

I'm trying to store it on ES with Pig using EsStorage.
The command

STORE A INTO 'v00000262/event' USING org.elasticsearch.hadoop.pig.EsStorage('es.input.json=true','es.http.timeout = 5m', 'es.index.auto.create = false', 'es.mapping.id=eventId', 'es.mapping.timestamp=ts', 'es.mapping.parent=visitorId', 'es.mapping.exclude=meta','es.nodes=$es_url'); 

works perfectly but the command

STORE A INTO 'v00000262/{meta.type}' USING org.elasticsearch.hadoop.pig.EsStorage('es.input.json=true','es.http.timeout = 5m', 'es.index.auto.create = false', 'es.mapping.id=eventId', 'es.mapping.timestamp=ts', 'es.mapping.parent=visitorId', 'es.mapping.exclude=meta','es.nodes=$es_url');

returns the error Invalid target URI HEAD@null/v00000262/{meta.type}

In the log, I have the following (I've removed the ES IP, but it's the correct one):

================================================================================
Pig Stack Trace

ERROR 1002: Unable to store alias A

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to store alias A
at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1694)
at org.apache.pig.PigServer.registerQuery(PigServer.java:623)
at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1063)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:501)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:66)
at org.apache.pig.Main.run(Main.java:558)
at org.apache.pig.Main.main(Main.java:170)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: org.elasticsearch.hadoop.rest.EsHadoopNoNodesLeftException: Connection error (check network and/or proxy settings)- all nodes failed; tried [[...:9200]]
at org.elasticsearch.hadoop.rest.NetworkClient.execute(NetworkClient.java:142)
at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:383)
at org.elasticsearch.hadoop.rest.RestClient.executeNotFoundAllowed(RestClient.java:391)
at org.elasticsearch.hadoop.rest.RestClient.exists(RestClient.java:467)
at org.elasticsearch.hadoop.rest.RestRepository.indexExists(RestRepository.java:449)
at org.elasticsearch.hadoop.rest.InitializationUtils.checkIndexExistence(InitializationUtils.java:203)
at org.elasticsearch.hadoop.mr.EsOutputFormat.init(EsOutputFormat.java:263)
at org.elasticsearch.hadoop.mr.EsOutputFormat.checkOutputSpecs(EsOutputFormat.java:233)
at org.apache.pig.newplan.logical.visitor.InputOutputFileValidatorVisitor.visit(InputOutputFileValidatorVisitor.java:69)
at org.apache.pig.newplan.logical.relational.LOStore.accept(LOStore.java:66)
at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:64)
at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
at org.apache.pig.newplan.DepthFirstWalker.walk(DepthFirstWalker.java:53)
at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
at org.apache.pig.newplan.logical.relational.LogicalPlan.validate(LogicalPlan.java:212)
at org.apache.pig.PigServer$Graph.compile(PigServer.java:1767)
at org.apache.pig.PigServer$Graph.access$300(PigServer.java:1443)
at org.apache.pig.PigServer.execute(PigServer.java:1356)
at org.apache.pig.PigServer.access$500(PigServer.java:113)
at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1689)
... 14 more

I don't see where I've made a mistake...
Is it a bug, or have I forgotten something?

Thanks in advance


(Costin Leau) #2

Can you turn on logging? Your example looks fine and should work - here's an example I just tried:

Doc with json lines such as:
{"number":"1","name":"Buckethead","url":"Bucketheadland.com","meta":{"type":"awesome"}}

and the following script:

A = LOAD 'artists.dat' USING PigStorage() AS (json: chararray);
STORE A INTO 'json-pig/nestedpattern-{meta.type}' USING org.elasticsearch.hadoop.pig.EsStorage('es.input.json=true');

yields the expected result.

P.S. Note that with json input, one cannot select or exclude fields - these options apply only for JSON that is generated by ES from Pig tables.
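
For readers unfamiliar with dynamic resources: the `{meta.type}` placeholder in the target is meant to be filled in per document from the field of that name. The Python sketch below only illustrates that idea on the sample line above. It is not ES-Hadoop's implementation, and `resolve_resource` is a hypothetical helper name introduced for this example.

```python
import json
import re

# Matches one {field.path} placeholder in a resource string.
PLACEHOLDER = re.compile(r"\{([^{}]+)\}")


def resolve_resource(pattern, json_line):
    """Fill each {a.b} placeholder with the value found at a.b in the document.

    Hypothetical helper for illustration only; a missing field raises KeyError.
    """
    doc = json.loads(json_line)

    def lookup(match):
        value = doc
        for key in match.group(1).split("."):
            value = value[key]  # walk down one level of nesting per path segment
        return str(value)

    return PLACEHOLDER.sub(lookup, pattern)


line = '{"number":"1","name":"Buckethead","url":"Bucketheadland.com","meta":{"type":"awesome"}}'
print(resolve_resource("json-pig/nestedpattern-{meta.type}", line))
```

With the sample document this resolves to `json-pig/nestedpattern-awesome`, which is the index/type the script above is expected to write to.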


#3

Thanks for your help.

Here is my full code:

REGISTER ./elasticsearch-hadoop.jar

A = LOAD './test.json' USING TextLoader() as (json: chararray);
STORE A INTO 'v00000262/{meta.type}' USING org.elasticsearch.hadoop.pig.EsStorage('es.input.json=true','es.http.timeout = 5m', 'es.index.auto.create = false', 'es.mapping.id=eventId', 'es.mapping.timestamp=ts', 'es.mapping.parent=visitorId', 'es.mapping.exclude=meta','es.nodes=$es_url');

With DEBUG enabled, I get the following logs in the Pig shell (Pig is in local mode, I don't know if that matters):

16/01/28 09:44:29 DEBUG pig.EsStorage: Elasticsearch input marked as JSON; bypassing serialization through [org.elasticsearch.hadoop.serialization.builder.NoOpValueWriter] instead of [class org.elasticsearch.hadoop.pig.PigValueWriter]
16/01/28 09:44:29 DEBUG pig.EsStorage: Using pre-defined writer serializer [org.elasticsearch.hadoop.serialization.builder.NoOpValueWriter] as default
16/01/28 09:44:29 DEBUG pig.EsStorage: Using pre-defined reader serializer [org.elasticsearch.hadoop.pig.PigValueReader] as default
16/01/28 09:44:29 DEBUG pig.EsStorage: JSON input specified; using pre-defined bytes/json converter [org.elasticsearch.hadoop.pig.PigBytesConverter] as default
16/01/28 09:44:29 DEBUG pig.EsStorage: Using pre-defined field extractor [org.elasticsearch.hadoop.pig.PigFieldExtractor] as default
16/01/28 09:44:29 ERROR rest.NetworkClient: Node [...:9200] failed (Invalid target URI HEAD@null/v00000262/{meta.type}); no other nodes left - aborting...
3628 [main] ERROR org.apache.pig.tools.grunt.Grunt  - ERROR 1002: Unable to store alias A
16/01/28 09:44:29 ERROR grunt.Grunt: ERROR 1002: Unable to store alias A
Details at logfile: /shared/clement/pig_1453974266088.log

And the file /shared/clement/pig_1453974266088.log contains:

Pig Stack Trace
---------------
ERROR 1002: Unable to store alias A

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to store alias A
        at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1694)
        at org.apache.pig.PigServer.registerQuery(PigServer.java:623)
        at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1063)
        at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:501)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
        at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:66)
        at org.apache.pig.Main.run(Main.java:558)
        at org.apache.pig.Main.main(Main.java:170)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: org.elasticsearch.hadoop.rest.EsHadoopNoNodesLeftException: Connection error (check network and/or proxy settings)- all nodes failed; tried [[...:9200]]
        at org.elasticsearch.hadoop.rest.NetworkClient.execute(NetworkClient.java:142)
        at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:383)
        at org.elasticsearch.hadoop.rest.RestClient.executeNotFoundAllowed(RestClient.java:391)
        at org.elasticsearch.hadoop.rest.RestClient.exists(RestClient.java:467)
        at org.elasticsearch.hadoop.rest.RestRepository.indexExists(RestRepository.java:449)
        at org.elasticsearch.hadoop.rest.InitializationUtils.checkIndexExistence(InitializationUtils.java:203)
        at org.elasticsearch.hadoop.mr.EsOutputFormat.init(EsOutputFormat.java:263)
        at org.elasticsearch.hadoop.mr.EsOutputFormat.checkOutputSpecs(EsOutputFormat.java:233)
        at org.apache.pig.newplan.logical.visitor.InputOutputFileValidatorVisitor.visit(InputOutputFileValidatorVisitor.java:69)
        at org.apache.pig.newplan.logical.relational.LOStore.accept(LOStore.java:66)
        ... 
================================================================================

(Costin Leau) #4

Can you please try enabling logging on the serialization and REST packages as indicated here?

This will provide more information, such as whether the connectivity actually works and which requests are made to ES.
This is an example from the test suite:

21:18:24,430 TRACE pool-1-thread-1 commonshttp.CommonsHttpTransport - Rx @[127.0.0.1] [200-OK] [{"cluster_name":"ES-HADOOP-TEST","nodes":{"ynQtzsBVRQ-8FQSu1rHzvg":{"name":"Man-Spider","transport_address":"local[1]","host":"local","ip":"0.0.0.0","version":"2.2.0-SNAPSHOT","build":"0682430","http_address":"127.0.0.1:9500","attributes":{"local":"true"},"transport":{"bound_address":["local[1]"],"publish_address":"local[1]","profiles":{}}}}}]
21:18:24,432 TRACE pool-1-thread-1 commonshttp.CommonsHttpTransport - Closing HTTP transport to 127.0.0.1:9500
21:18:24,432 TRACE pool-1-thread-1 commonshttp.CommonsHttpTransport - Opening HTTP transport to 127.0.0.1:9500
21:18:24,432 TRACE pool-1-thread-1 commonshttp.CommonsHttpTransport - Tx [GET]@[127.0.0.1:9500][_nodes/http] w/ payload [null]
21:18:24,436 TRACE pool-1-thread-1 commonshttp.CommonsHttpTransport - Rx @[127.0.0.1] [200-OK] [{"cluster_name":"ES-HADOOP-TEST","nodes":{"ynQtzsBVRQ-8FQSu1rHzvg":{"name":"Man-Spider","transport_address":"local[1]","host":"local","ip":"0.0.0.0","version":"2.2.0-SNAPSHOT","build":"0682430","http_address":"127.0.0.1:9500","attributes":{"local":"true"},"http":{"bound_address":["[::1]:9500","127.0.0.1:9500"],"publish_address":"127.0.0.1:9500","max_content_length_in_bytes":104857600}}}}]
21:18:24,439 TRACE pool-1-thread-1 commonshttp.CommonsHttpTransport - Closing HTTP transport to 127.0.0.1:9500
21:18:24,440  INFO pool-1-thread-1 mr.EsOutputFormat - Writing to [json-pig/nestedpattern-{meta.type}]
21:18:24,445 TRACE pool-1-thread-1 commonshttp.CommonsHttpTransport - Opening HTTP transport to 127.0.0.1:9500
21:18:24,452 DEBUG pool-1-thread-1 bulk.AbstractBulkFactory - JSON input; using internal field extractor for efficient parsing...
21:18:24,457 TRACE pool-1-thread-1 bulk.JsonTemplatedBulk - About to extract information from [{"number":"1","name":"Buckethead","url":"http://bucketheadland.com","meta":{"type":"1"}}]
21:18:24,457 TRACE pool-1-thread-1 field.JsonFieldExtractors - About to look for paths [[meta.type]] in doc ...
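
Once TRACE logging is on, the `Tx` lines show each request sent to ES. As a rough aid for scanning a long log, the Python sketch below pulls the method, node and path out of those lines. The regular expression is derived only from the single `Tx` line quoted above, so it is an assumption that other versions or log layouts use the same shape; `requests_sent` is a hypothetical helper name.

```python
import re

# Shape of the "Tx" (transmit) lines in the TRACE output quoted above.
TX_LINE = re.compile(r"Tx \[(?P<method>[A-Z]+)\]@\[(?P<node>[^\]]+)\]\[(?P<path>[^\]]*)\]")


def requests_sent(log_text):
    """Return (method, node, path) for every Tx line found in the log text."""
    return [
        (m.group("method"), m.group("node"), m.group("path"))
        for m in TX_LINE.finditer(log_text)
    ]


sample = (
    "21:18:24,432 TRACE pool-1-thread-1 commonshttp.CommonsHttpTransport - "
    "Tx [GET]@[127.0.0.1:9500][_nodes/http] w/ payload [null]\n"
    "21:18:24,432 TRACE pool-1-thread-1 commonshttp.CommonsHttpTransport - "
    "Opening HTTP transport to 127.0.0.1:9500\n"
)
for method, node, path in requests_sent(sample):
    print(method, node, path)
```

If no `Tx` line appears at all before the failure, the request was rejected before it left the client, which is itself useful information.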

#5

Here is the log I get:

16/02/01 08:11:42 DEBUG pig.EsStorage: Elasticsearch input marked as JSON; bypassing serialization through [org.elasticsearch.hadoop.serialization.builder.NoOpValueWriter] instead of [class org.elasticsearch.hadoop.pig.PigValueWriter]
16/02/01 08:11:42 DEBUG pig.EsStorage: Using pre-defined writer serializer [org.elasticsearch.hadoop.serialization.builder.NoOpValueWriter] as default
16/02/01 08:11:42 DEBUG pig.EsStorage: Using pre-defined reader serializer [org.elasticsearch.hadoop.pig.PigValueReader] as default
16/02/01 08:11:42 DEBUG pig.EsStorage: JSON input specified; using pre-defined bytes/json converter [org.elasticsearch.hadoop.pig.PigBytesConverter] as default
16/02/01 08:11:42 DEBUG pig.EsStorage: Using pre-defined field extractor [org.elasticsearch.hadoop.pig.PigFieldExtractor] as default
16/02/01 08:11:42 TRACE commonshttp.CommonsHttpTransport: Opening HTTP transport to 10.0.1.155:9200
16/02/01 08:11:42 TRACE rest.NetworkClient: Caught exception while performing request [10.0.1.155:9200][v00000262/{meta.type}] - falling back to the next node in line...
org.elasticsearch.hadoop.rest.EsHadoopTransportException: Invalid target URI HEAD@null/v00000262/{meta.type}
	at org.elasticsearch.hadoop.rest.commonshttp.CommonsHttpTransport.execute(CommonsHttpTransport.java:405)
	at org.elasticsearch.hadoop.rest.NetworkClient.execute(NetworkClient.java:104)
	at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:383)
	at org.elasticsearch.hadoop.rest.RestClient.executeNotFoundAllowed(RestClient.java:391)
	at org.elasticsearch.hadoop.rest.RestClient.exists(RestClient.java:467)
	at org.elasticsearch.hadoop.rest.RestRepository.indexExists(RestRepository.java:449)
	at org.elasticsearch.hadoop.rest.InitializationUtils.checkIndexExistence(InitializationUtils.java:203)
	at org.elasticsearch.hadoop.mr.EsOutputFormat.init(EsOutputFormat.java:263)
	at org.elasticsearch.hadoop.mr.EsOutputFormat.checkOutputSpecs(EsOutputFormat.java:233)
	at org.apache.pig.newplan.logical.visitor.InputOutputFileValidatorVisitor.visit(InputOutputFileValidatorVisitor.java:69)
	at org.apache.pig.newplan.logical.relational.LOStore.accept(LOStore.java:66)
	at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:64)
	at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
	at org.apache.pig.newplan.DepthFirstWalker.walk(DepthFirstWalker.java:53)
	at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
	at org.apache.pig.newplan.logical.relational.LogicalPlan.validate(LogicalPlan.java:212)
	at org.apache.pig.PigServer$Graph.compile(PigServer.java:1767)
	at org.apache.pig.PigServer$Graph.access$300(PigServer.java:1443)
	at org.apache.pig.PigServer.execute(PigServer.java:1356)
	at org.apache.pig.PigServer.access$500(PigServer.java:113)
	at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1689)
	at org.apache.pig.PigServer.registerQuery(PigServer.java:623)
	at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1063)
	at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:501)
	at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
	at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
	at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:66)
	at org.apache.pig.Main.run(Main.java:558)
	at org.apache.pig.Main.main(Main.java:170)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: org.apache.commons.httpclient.URIException: escaped absolute path not valid
	at org.apache.commons.httpclient.URI.setRawPath(URI.java:2837)
	at org.apache.commons.httpclient.URI.parseUriReference(URI.java:2023)
	at org.apache.commons.httpclient.URI.<init>(URI.java:147)
	at org.apache.commons.httpclient.HttpMethodBase.getURI(HttpMethodBase.java:265)
	at org.elasticsearch.hadoop.rest.commonshttp.CommonsHttpTransport.execute(CommonsHttpTransport.java:403)
	... 34 more
16/02/01 08:11:42 ERROR rest.NetworkClient: Node [10.0.1.155:9200] failed (Invalid target URI HEAD@null/v00000262/{meta.type}); no other nodes left - aborting...
102230 [main] ERROR org.apache.pig.tools.grunt.Grunt  - ERROR 1002: Unable to store alias A
16/02/01 08:11:42 ERROR grunt.Grunt: ERROR 1002: Unable to store alias A

It seems to be a bug, don't you think?
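
The `Caused by: URIException: escaped absolute path not valid` line suggests that the literal string `v00000262/{meta.type}`, braces included, reached the HTTP layer as a request path. Raw `{` and `}` are not among the characters RFC 3986 allows unescaped in a path. The Python sketch below shows this with the standard library only; it does not reproduce the validation done by commons-httpclient, and `is_valid_raw_path` is a hypothetical helper name.

```python
import re
from urllib.parse import quote

# Characters RFC 3986 allows unescaped in a path: unreserved, sub-delims, ":", "@", "/",
# plus percent-encoded triplets such as %7B.
VALID_PATH = re.compile(r"^(?:[A-Za-z0-9\-._~!$&'()*+,;=:@/]|%[0-9A-Fa-f]{2})*$")


def is_valid_raw_path(path):
    """True if the path could be placed in a URI without further escaping."""
    return VALID_PATH.match(path) is not None


print(is_valid_raw_path("/v00000262/event"))        # True
print(is_valid_raw_path("/v00000262/{meta.type}"))  # False
print(quote("/v00000262/{meta.type}"))              # /v00000262/%7Bmeta.type%7D
```

Percent-encoding the braces would presumably not be the right fix, since the pattern is meant to be resolved per document rather than sent to ES as-is. The point is only that an unresolved pattern cannot be used directly as a request path.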


#6

I tried again with the 2.2 release, but I still get the same error...


(Costin Leau) #7

@cjuste it sure looks like a bug; however, as I mentioned above with the logs, the problem is that I cannot reproduce it. In fact, the test suite already contains a test similar to your example.

Based on your configuration, it looks like $es_url is the only thing that is different - would changing that make a difference?
Also, can you confirm that you have only one version of ES and that it is the latest one?


#8

@costin I've just tried with a fresh install of elasticsearch.

I installed Elasticsearch on Ubuntu 14.04 LTS using the 2.2.0 .deb package. There's only one node.
I've just downloaded ES-Hadoop 2.2.0 (just in case).

Here's my full Pig script:

%default es_url '10.0.1.145'

REGISTER ./elasticsearch-hadoop-pig.jar;

A = LOAD './test.json' USING TextLoader() as (json: chararray);
STORE A INTO 'v00000262/{meta.type}' USING org.elasticsearch.hadoop.pig.EsStorage('es.input.json=true','es.http.timeout = 5m', 'es.index.auto.create = false', 'es.mapping.id=eventId', 'es.mapping.timestamp=ts', 'es.mapping.parent=visitorId', 'es.mapping.exclude=meta','es.nodes=$es_url');

Here is the log I get:

16/02/22 16:02:59 DEBUG pig.EsStorage: Elasticsearch input marked as JSON; bypassing serialization through [org.elasticsearch.hadoop.serialization.builder.NoOpValueWriter] instead of [class org.elasticsearch.hadoop.pig.PigValueWriter]
16/02/22 16:02:59 DEBUG pig.EsStorage: Using pre-defined writer serializer [org.elasticsearch.hadoop.serialization.builder.NoOpValueWriter] as default
16/02/22 16:02:59 DEBUG pig.EsStorage: Using pre-defined reader serializer [org.elasticsearch.hadoop.pig.PigValueReader] as default
16/02/22 16:02:59 DEBUG pig.EsStorage: JSON input specified; using pre-defined bytes/json converter [org.elasticsearch.hadoop.pig.PigBytesConverter] as default
16/02/22 16:02:59 DEBUG pig.EsStorage: Using pre-defined field extractor [org.elasticsearch.hadoop.pig.PigFieldExtractor] as default
16/02/22 16:02:59 TRACE commonshttp.CommonsHttpTransport: Opening HTTP transport to 10.0.1.145:9200
16/02/22 16:02:59 TRACE rest.NetworkClient: Caught exception while performing request [10.0.1.145:9200][v00000262/{meta.type}] - falling back to the next node in line...
org.elasticsearch.hadoop.rest.EsHadoopTransportException: Invalid target URI HEAD@null/v00000262/{meta.type}
	at org.elasticsearch.hadoop.rest.commonshttp.CommonsHttpTransport.execute(CommonsHttpTransport.java:443)
	at org.elasticsearch.hadoop.rest.NetworkClient.execute(NetworkClient.java:104)
	at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:423)
	at org.elasticsearch.hadoop.rest.RestClient.executeNotFoundAllowed(RestClient.java:431)
	at org.elasticsearch.hadoop.rest.RestClient.exists(RestClient.java:507)
	at org.elasticsearch.hadoop.rest.RestRepository.indexExists(RestRepository.java:467)
	at org.elasticsearch.hadoop.rest.InitializationUtils.checkIndexExistence(InitializationUtils.java:203)
	at org.elasticsearch.hadoop.mr.EsOutputFormat.init(EsOutputFormat.java:263)
	at org.elasticsearch.hadoop.mr.EsOutputFormat.checkOutputSpecs(EsOutputFormat.java:233)
	at ...
16/02/22 16:02:59 ERROR rest.NetworkClient: Node [10.0.1.145:9200] failed (Invalid target URI HEAD@null/v00000262/{meta.type}); no other nodes left - aborting...
5442 [main] ERROR org.apache.pig.tools.grunt.Grunt  - ERROR 1002: Unable to store alias A
16/02/22 16:02:59 ERROR grunt.Grunt: ERROR 1002: Unable to store alias A

I'm running Pig 0.15.0 from Hortonworks on Ubuntu 14.04.


(Costin Leau) #9

I'll try to replicate the issue myself. If you replace es_url with the actual value do you see any change in behaviour?


#10

It doesn't change anything, unfortunately...


#11

I have OpenJDK 1.7.0_91 on the Pig VM.
Could that cause any trouble?


(Costin Leau) #12

It might - any reason why you are not using Sun/Oracle JDK?


#13

I'm now using Oracle JDK 8, but this hasn't changed anything. However, I've found something else.
If I change my STORE statement, I get a totally different error. I reduced the parameters to just es.input.json and es.nodes.

STORE A INTO '{index}/{type}' USING org.elasticsearch.hadoop.pig.EsStorage('es.input.json=true','es.nodes=$es_url');

In this case, it connects correctly to ES (I get all the nodes' IPs).
But I get this error:

java.lang.Exception: org.elasticsearch.hadoop.rest.EsHadoopInvalidRequest: id must not be null
	at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: org.elasticsearch.hadoop.rest.EsHadoopInvalidRequest: id must not be null
	at org.elasticsearch.hadoop.rest.RestClient.checkResponse(RestClient.java:467)
	at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:425)
	at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:415)
	at org.elasticsearch.hadoop.rest.RestClient.bulk(RestClient.java:145)
	at org.elasticsearch.hadoop.rest.RestRepository.tryFlush(RestRepository.java:225)
	at org.elasticsearch.hadoop.rest.RestRepository.flush(RestRepository.java:248)
	at org.elasticsearch.hadoop.rest.RestRepository.close(RestRepository.java:267)
	at org.elasticsearch.hadoop.mr.EsOutputFormat$EsRecordWriter.doClose(EsOutputFormat.java:214)
	at org.elasticsearch.hadoop.mr.EsOutputFormat$EsRecordWriter.close(EsOutputFormat.java:196)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.close(PigOutputFormat.java:146)
	at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:670)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

It seems the problem comes from my mapping parameters (I removed mapping.exclude, but that doesn't change anything).


(Costin Leau) #14

Turn on REST logging to see what data is sent to ES and the error being returned.


#15

I know why I got the previous error (id must not be null). I've declared routing and parent as required, so it's logical that I get an error if I don't specify them. But this means I successfully contacted ES, unlike with the earlier error (escaped absolute path not valid).

I conclude that the problem comes from combining the mapping parameters with a dynamic index/type.

You said

without specifying any mapping. Have you tried with a mapping?
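
One way to narrow down which option interacts badly with the dynamic resource is to start from the two options that are known to connect (`es.input.json` and `es.nodes`) and re-add the others one at a time. The Python sketch below only generates the Pig STORE statements for such a test series; the helper names are hypothetical. As a hypothesis that is not confirmed anywhere in this thread, the stack traces show the failing HEAD request coming from `InitializationUtils.checkIndexExistence`, so `es.index.auto.create = false` may be worth trying first.

```python
# Options that connected successfully in the reduced test.
BASE_OPTIONS = ["es.input.json=true", "es.nodes=$es_url"]
# Remaining options from the failing STORE statement, to be re-added one at a time.
CANDIDATES = [
    "es.http.timeout = 5m",
    "es.index.auto.create = false",
    "es.mapping.id=eventId",
    "es.mapping.timestamp=ts",
    "es.mapping.parent=visitorId",
    "es.mapping.exclude=meta",
]


def store_statement(resource, options):
    """Render a Pig STORE statement for the given resource and EsStorage options."""
    args = ",".join("'{}'".format(opt) for opt in options)
    return "STORE A INTO '{}' USING org.elasticsearch.hadoop.pig.EsStorage({});".format(resource, args)


def one_at_a_time(resource):
    """Yield (candidate option, statement) pairs: the base options plus one candidate each."""
    for candidate in CANDIDATES:
        yield candidate, store_statement(resource, BASE_OPTIONS + [candidate])


for option, statement in one_at_a_time("v00000262/{meta.type}"):
    print("-- testing:", option)
    print(statement)
```

Each generated statement can be pasted into the script in place of the original STORE. The first variant that brings back `Invalid target URI` points at the option involved.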

