Error Writing data to ElasticSearch from PIG


(ravimbhatt) #1

Hello All,

I have a very simple pig script.

set default_parallel 8;

REGISTER 'elasticsearch-hadoop-pig-1.3.0.M3.jar';

sets = LOAD 'wasb://yyyyyy@xxxxxxxxxxxxxxxx' USING PigStorage('\t') AS
(itemid:chararray, list:chararray);
a = LIMIT sets 100;
b = foreach a generate itemid;
STORE b INTO 'indexOne/iis' USING
org.elasticsearch.hadoop.pig.EsStorage('es.nodes=ec2-xx-xx-xxx-xxx.eu-west-1.compute.amazonaws.com','es.mapping.names=itemid:@itemid','es.resource=indexOne/iis');

I get the error below in my reducer:

java.lang.NullPointerException
at javax.xml.bind.DatatypeConverterImpl.guessLength(DatatypeConverterImpl.java:653)
at javax.xml.bind.DatatypeConverterImpl._parseBase64Binary(DatatypeConverterImpl.java:691)
at javax.xml.bind.DatatypeConverterImpl.parseBase64Binary(DatatypeConverterImpl.java:433)
at javax.xml.bind.DatatypeConverter.parseBase64Binary(DatatypeConverter.java:342)
at org.elasticsearch.hadoop.util.IOUtils.deserializeFromBase64(IOUtils.java:56)
at org.elasticsearch.hadoop.pig.EsStorage.prepareToWrite(EsStorage.java:178)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.<init>(PigOutputFormat.java:125)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getRecordWriter(PigOutputFormat.java:86)
at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.<init>(ReduceTask.java:568)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:637)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:418)
at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1233)
at org.apache.hadoop.mapred.Child.main(Child.java:260)

Am I missing something basic here?

Thanks!

Ravi

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/e6c6ba6d-012b-49b0-b6a7-785e001d34b9%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Costin Leau) #2

Hi,

You're not - it looks like a bug inside the base64 function within the JDK (note the input is generated by the same
class). Out of curiosity what JDK version are you using?
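
One way to answer the JDK question for the JVM that actually runs the code is to print the standard system properties from inside it. The following is only a minimal sketch: the class name `JvmInfo` and the method `describe` are made up for illustration, and it would need to run on a cluster node (not just the client machine) to say anything about the reducer's JVM.

```java
public class JvmInfo {

    // Builds a one-line summary from standard JVM system properties.
    // "java.vendor", "java.vm.name" and "java.version" are defined by the
    // Java platform; the class and method names here are hypothetical.
    public static String describe() {
        return System.getProperty("java.vendor") + " / "
                + System.getProperty("java.vm.name") + " / "
                + System.getProperty("java.version");
    }

    public static void main(String[] args) {
        // Prints the vendor, VM name and version of the running JVM.
        System.out.println(describe());
    }
}
```

The vendor and VM name help tell an OpenJDK build apart from a Sun/Oracle one, which the version string alone may not show.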

Cheers,

--
Costin



(ravimbhatt) #3

Thanks for your reply.

My Java version is: openjdk version "1.7.0-internal"

Do you suggest trying a different version of Java?



(Costin Leau) #4

I've committed a quick fix and pushed a dev build [1]. I recommend trying that instead; it should work on your
platform as well.

[1] http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/current/install.html#download-dev

P.S. For both Hadoop and ES, Sun/Oracle JDK is recommended.

--
Costin



(ravimbhatt) #5

Hi Costin,

Thanks for the fix. I tried it and now I get a new error:

java.lang.NullPointerException
at org.codehaus.jackson.node.TextNode.getBinaryValue(TextNode.java:69)
at org.codehaus.jackson.node.TextNode.getBinaryValue(TextNode.java:160)
at org.elasticsearch.hadoop.util.IOUtils.deserializeFromBase64(IOUtils.java:71)
at org.elasticsearch.hadoop.pig.EsStorage.prepareToWrite(EsStorage.java:178)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.<init>(PigOutputFormat.java:125)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getRecordWriter(PigOutputFormat.java:86)
at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.<init>(ReduceTask.java:568)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:637)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:418)
at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1233)
at org.apache.hadoop.mapred.Child.main(Child.java:260)

Maybe the code is missing some more null checks? Can you please help?

Thanks!

Ravi



(Costin Leau) #6

Hi,

I've pushed another fix; however, it's probably just a band-aid. For some reason your Pig engine does not set the schema
(from b) on the OutputWriter.
I've tried reproducing the error but I couldn't - it would be great if you could join the IRC channel [1] to track
down the source of the problem.
By the way, what is the Pig version that you are using?
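
To illustrate the failure mode in isolation: a Base64 decoder that is handed a null string throws a NullPointerException, which would fit both traces if the serialized schema never reached the writer. The sketch below is not the es-hadoop code and not the actual fix; it uses `java.util.Base64` (Java 8+) rather than the classes in the traces, and `Base64Guard` and `decodeOrEmpty` are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class Base64Guard {

    // Hypothetical helper: returns an empty array instead of throwing
    // when the encoded value is missing (null or empty).
    public static byte[] decodeOrEmpty(String encoded) {
        if (encoded == null || encoded.isEmpty()) {
            return new byte[0];
        }
        return Base64.getDecoder().decode(encoded);
    }

    public static void main(String[] args) {
        String encoded = Base64.getEncoder()
                .encodeToString("itemid:chararray".getBytes(StandardCharsets.UTF_8));

        // Round trip of a value that is present.
        System.out.println(new String(decodeOrEmpty(encoded), StandardCharsets.UTF_8));

        // A missing value is mapped to an empty array instead of failing.
        System.out.println(decodeOrEmpty(null).length);

        // Without the guard, the decoder itself rejects null.
        try {
            Base64.getDecoder().decode((String) null);
        } catch (NullPointerException e) {
            System.out.println("unguarded decode threw NullPointerException");
        }
    }
}
```

A guard like this only hides the symptom, which is why it counts as a band-aid: the writer still ends up without a schema.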

Cheers,

[1] http://www.elasticsearch.org/community/

--
Costin


