Unable to write data to Elasticsearch using Hadoop Pig


(siva mannem) #1

I installed ES (at /usr/lib/elasticsearch/) on our gateway server, and I am
able to run basic curl commands such as -XPUT and -XGET to create indices
and retrieve the data in them.
I can pass a single-line JSON record, but I am unable to pass a JSON file
as input to curl -XPUT.
Can anybody give me the syntax for passing a JSON file as input to the
curl -XPUT command?
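
On the curl question: when the value given to -d or --data-binary starts with @, curl reads the request body from that file. A minimal sketch, assuming ES listens on localhost:9200; the helper name put_json_file and the document id 1 are hypothetical and only used for illustration:

```shell
#!/bin/sh
# Sketch only: put_json_file is a hypothetical helper, not part of curl or ES.
# $1 = document URL, for example http://localhost:9200/usa/ca/1
# $2 = path to a local file holding ONE JSON document
put_json_file() {
    # The leading @ makes curl send the file's contents as the request body.
    # --data-binary sends the bytes unchanged (plain -d strips newlines).
    curl -s -XPUT "$1" --data-binary "@$2"
}

# Example call (assumes ES is reachable on localhost:9200):
# put_json_file "http://localhost:9200/usa/ca/1" j.json
```

This indexes one document per call; a file holding many records would need the _bulk API and its action lines instead.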

My next issue: I copied the following 4 elasticsearch-hadoop jar files
elasticsearch-hadoop-1.3.0.M2.jar
elasticsearch-hadoop-1.3.0.M2-sources.jar
elasticsearch-hadoop-1.3.0.M2-javadoc.jar
elasticsearch-hadoop-1.3.0.M2-yarn.jar

to /usr/lib/elasticsearch/elasticsearch-0.90.9/lib
and /usr/lib/gphd/pig/

I have the following JSON file, j.json,
++++++
{"k1":"v1" , "k2":"v2" , "k3":"v3"}
++++++

in my_hdfs_path.

My Pig script is write_data_to_es.pig:
+++++++++++++
REGISTER /usr/lib/gphd/pig/elasticsearch-hadoop-1.3.0.M2-yarn.jar;
DEFINE ESTOR org.elasticsearch.hadoop.pig.EsStorage('es.resource=usa/ca');
A = LOAD '/my_hdfs_path/j.json' using
JsonLoader('k1:chararray,k2:chararray,k3:chararray');
STORE A into 'usa/ca' USING ESTOR('es.input.json=true');
++++++++++++++

When I run my Pig script
+++++++++
pig -x mapreduce write_data_to_es.pig
+++++++++

I get the following error:
+++++++++
Input(s):
Failed to read data from "/my_hdfs_path/j.json"

Output(s):
Failed to produce result in "usa/ca"

Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_1390436301987_0089

2014-03-05 00:26:50,839 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
2014-03-05 00:26:50,841 [main] ERROR org.apache.pig.tools.grunt.GruntParser - ERROR 2997: Input(s):
Failed to read data from "/elastic_search/es_hadoop_test.json"

Output(s):
Failed to produce result in "mannem/siva"

Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_1390436301987_0089

2014-03-05 00:26:50,839 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
2014-03-05 00:26:50,841 [main] ERROR org.apache.pig.tools.grunt.GruntParser - ERROR 2997: Encountered IOException. Out of nodes and retries; caught exception

Details at logfile: /usr/lib/elasticsearch/elasticsearch-0.90.9/pig_1393997175206.log
++++++++++++

I am using Pivotal Hadoop version 1.0.1, which is basically Apache Hadoop
2.0.2. The Pig version is 0.10.1 and the Elasticsearch version is 0.90.9.

Can anybody help me out here?
Thank you so much in advance for your help.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/f4740e2a-868f-489c-8f9d-842c08ecddff%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(siva mannem) #2

On Tuesday, March 4, 2014 9:32:55 PM UTC-8, siva mannem wrote:



(Yann Barraud) #3

Hi,

Is your ES instance known by your Hadoop cluster (/etc/hosts)?

It does not even seem to read from it.

Cheers,
Yann



(Costin Leau) #4

The error indicates that Pig cannot access Elasticsearch. Make sure that you specify the proper ES IP and port in your
configuration. The defaults (localhost:9200) work only if you are running on a local node or if you have Elasticsearch
running on each node of your Hadoop cluster.

Also, when loading JSON files, don't use JsonLoader but rather PigStorage. JsonLoader's purpose is to transform JSON into
objects, but that is not needed: es-hadoop already does that, and if your data is in JSON, it will stream the data as is
to ES.

Again, I recommend you go through the reference documentation:
http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/current/pig.html
http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/current/mapping.html
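
Putting the two suggestions together, the script could look roughly like the sketch below. It writes a revised write_data_to_es.pig from the shell. The host name in ES_HOST is a placeholder (an assumption, not a value from this setup) and must be a name that every Hadoop node can resolve; the jar path, HDFS path and index/type are the ones from the original post.

```shell
#!/bin/sh
# Sketch only: ES_HOST is a placeholder; replace it with the machine running ES.
ES_HOST="${ES_HOST:-es-host.example}"

# Write the revised Pig script. ${ES_HOST} is expanded by the shell here.
cat > write_data_to_es.pig <<EOF
REGISTER /usr/lib/gphd/pig/elasticsearch-hadoop-1.3.0.M2-yarn.jar;
-- Load each line as one raw chararray. PigStorage() splits on tabs by
-- default, so this assumes the JSON lines contain no tab characters.
A = LOAD '/my_hdfs_path/j.json' USING PigStorage() AS (json:chararray);
-- es.nodes replaces the localhost default; es.input.json=true tells
-- es-hadoop to send each line to ES as is.
STORE A INTO 'usa/ca' USING org.elasticsearch.hadoop.pig.EsStorage('es.nodes=${ES_HOST}', 'es.input.json=true');
EOF

# Then run it as before:
# pig -x mapreduce write_data_to_es.pig
```

Each setting is passed to EsStorage as its own quoted 'key=value' argument, which keeps the node list and the JSON flag from running into each other.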

--
Costin



(siva mannem) #5

Yann and Costin,
Thank you so much for your quick replies.
I am now able to write data to ES from Pig and also read data from ES
using Pig.
I modified the DEFINE statement in my Pig script as follows:
++++++++++
DEFINE ESTOR org.elasticsearch.hadoop.pig.EsStorage('es.nodes=gateway1,es.resource=usa/ca');
++++++++++



(hanine) #6

I get the same error, but I don't know what I have to change in my
"/etc/hosts".
Thank you for your help.



(Costin Leau) #7

Check your network settings and make sure that the Hadoop nodes can communicate with the ES nodes.
If you install ES alongside Hadoop itself, this shouldn't be a problem.
There are various ways to check this: try ping, tracert, etc.

Please refer to your distro manual/documentation for more information about the configuration and setup.
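
As a rough sketch of such a check, the helper below first pings the ES host and then asks its HTTP port for a response. The function name check_es is hypothetical, the default port 9200 is the ES default mentioned earlier in the thread, and `ping -c 1` assumes a Linux-style ping. It is meant to be run on each Hadoop node, not only on the gateway.

```shell
#!/bin/sh
# Sketch only: check_es is a hypothetical helper; run it on every Hadoop node.
# $1 = ES host name or IP, $2 = ES HTTP port (defaults to 9200)
check_es() {
    host="$1"
    port="${2:-9200}"

    # Step 1: name resolution and basic reachability. ICMP can be blocked by
    # a firewall, so a failure here is a hint rather than proof.
    if ! ping -c 1 "$host" >/dev/null 2>&1; then
        echo "cannot ping $host"
        return 1
    fi

    # Step 2: the ES HTTP port, which is what es-hadoop talks to.
    if ! curl -s --max-time 5 "http://$host:$port/" >/dev/null; then
        echo "no HTTP answer from $host:$port"
        return 2
    fi

    echo "Elasticsearch reachable at $host:$port"
}

# Example call (gateway1 is the ES host name used earlier in this thread):
# check_es gateway1
```

If step 1 fails because the name cannot be resolved, an entry for the ES host in /etc/hosts on that node is the usual fix.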

Cheers,

--
Costin



(hanine) #8

Thank you so much for your quick reply.
Here is what I have done:
1- Installed hadoop-1.2.1 (pig-0.12.0 / hive-0.11.0 / ...)
2- Downloaded Elasticsearch-1.0.1 and put it in the same directory as Hadoop
3- Copied the following 4 elasticsearch-hadoop jar files
elasticsearch-hadoop-1.3.0.M2.jar
elasticsearch-hadoop-1.3.0.M2-sources.jar
elasticsearch-hadoop-1.3.0.M2-javadoc.jar
elasticsearch-hadoop-1.3.0.M2-yarn.jar
to /pig and hadoop/lib
4- Added them to the PIG_CLASSPATH

When I take data from my Desktop and put it in Elasticsearch using a Pig
script, it works very well, but when I try to get the data from my HDFS,
it gives me this:

2014-05-12 23:16:31,765 [main] ERROR org.apache.pig.tools.pigstats.SimplePigStats - ERROR: java.io.IOException: Out of nodes and retries; caught exception
2014-05-12 23:16:31,765 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
2014-05-12 23:16:31,766 [main] INFO org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics:

HadoopVersion PigVersion UserId StartedAt FinishedAt Features
1.2.1 0.12.0 hduser 2014-05-12 23:15:34 2014-05-12 23:16:31 GROUP_BY

Failed!

Failed Jobs:
JobId Alias Feature Message Outputs
job_201405122310_0001 weblog_count,weblog_group,weblogs GROUP_BY,COMBINER Message: Job failed! Error - # of failed Reduce Tasks exceeded allowed limit. FailedCount: 1. LastFailedTask: task_201405122310_0001_r_000000 weblogs1/logs2,

Input(s):
Failed to read data from "/user/weblogs"

Output(s):
Failed to produce result in "weblogs1/logs2"

Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_201405122310_0001

2014-05-12 23:16:31,766 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!

And here is the script :

weblogs = LOAD '/user/weblogs' USING PigStorage('\t')
    AS (client_ip : chararray,
        full_request_date : chararray,
        day : int,
        month : chararray,
        month_num : int,
        year : int,
        hour : int,
        minute : int,
        second : int,
        timezone : chararray,
        http_verb : chararray,
        uri : chararray,
        http_status_code : chararray,
        bytes_returned : chararray,
        referrer : chararray,
        user_agent : chararray
    );
weblog_group = GROUP weblogs BY (client_ip, year, month_num);
weblog_count = FOREACH weblog_group GENERATE group.client_ip, group.year,
    group.month_num, COUNT_STAR(weblogs) AS pageviews;
STORE weblog_count INTO 'weblogs1/logs2' USING org.elasticsearch.hadoop.pig.EsStorage();

Le lundi 12 mai 2014 16:28:20 UTC+1, Costin Leau a écrit :

Check your network settings and make sure that the Hadoop nodes can
communicate with the ES nodes.
If you install ES besides Hadoop itself, this shouldn't be a problem.
There are various way to check this - try ping, tracert, etc...

Please refer to your distro manual/documentation for more information
about the configuration and setup.

Cheers,

On 5/12/14 3:42 PM, hanine haninne wrote:

I had get the same erreur but I don't know what I have to change in my
"/etc/hosts"
thank you for your help

Le mercredi 5 mars 2014 09:39:46 UTC, Yann Barraud a écrit :

Hi, 

Is your ES instance known by your Hadoop cluster (/etc/hosts) ? 

It does not even seems to read in it. 

Cheers, 
Yann 


--
Costin

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/cd9d3143-556a-43c8-9cfd-78b666db48b7%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Costin Leau) #9

I would recommend upgrading to the latest es-hadoop, 2.0 RC1.
Also consider reading [1].

Hope this helps,
[1] http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/current/troubleshooting.html

On 5/13/14 1:20 AM, hanine haninne wrote:

Thank you so much for your quick reply.
Here is what I have done:
1- installed hadoop-1.2.1 (pig-0.12.0 / hive-0.11.0 / ...)
2- downloaded Elasticsearch-1.0.1 and put it in the same folder as Hadoop
3- copied the following 4 elasticsearch-hadoop jar files
elasticsearch-hadoop-1.3.0.M2.jar
elasticsearch-hadoop-1.3.0.M2-sources.jar
elasticsearch-hadoop-1.3.0.M2-javadoc.jar
elasticsearch-hadoop-1.3.0.M2-yarn.jar
to /pig and hadoop/lib
4- added them to PIG_CLASSPATH

Note that when I take data from my Desktop and put it into Elasticsearch using a Pig script, it works very well, but when
I try to read the data from HDFS I get this:

2014-05-12 23:16:31,765 [main] ERROR org.apache.pig.tools.pigstats.SimplePigStats - ERROR: java.io.IOException: Out of nodes and retries; caught exception
2014-05-12 23:16:31,765 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
2014-05-12 23:16:31,766 [main] INFO org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics:

HadoopVersion PigVersion UserId StartedAt FinishedAt Features
1.2.1 0.12.0 hduser 2014-05-12 23:15:34 2014-05-12 23:16:31 GROUP_BY

Failed!

Failed Jobs:
JobId Alias Feature Message Outputs
job_201405122310_0001 weblog_count,weblog_group,weblogs GROUP_BY,COMBINER Message: Job failed! Error - # of failed Reduce Tasks exceeded allowed limit. FailedCount: 1. LastFailedTask: task_201405122310_0001_r_000000 weblogs1/logs2,

Input(s):
Failed to read data from "/user/weblogs"

Output(s):
Failed to produce result in "weblogs1/logs2"

Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_201405122310_0001

2014-05-12 23:16:31,766 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!

And here is the script :

weblogs = LOAD '/user/weblogs' USING PigStorage('\t')
AS (client_ip : chararray,
full_request_date : chararray,
day : int,
month : chararray,
month_num : int,
year : int,
hour : int,
minute : int,
second : int,
timezone : chararray,
http_verb : chararray,
uri : chararray,
http_status_code : chararray,
bytes_returned : chararray,
referrer : chararray,
user_agent : chararray
);
weblog_group = GROUP weblogs by (client_ip, year, month_num);
weblog_count = FOREACH weblog_group GENERATE group.client_ip, group.year, group.month_num, COUNT_STAR(weblogs) as pageviews;
STORE weblog_count INTO 'weblogs1/logs2' USING org.elasticsearch.hadoop.pig.EsStorage();

On Monday, May 12, 2014 16:28:20 UTC+1, Costin Leau wrote:

Check your network settings and make sure that the Hadoop nodes can communicate with the ES nodes.
If you install ES alongside Hadoop itself, this shouldn't be a problem.
There are various ways to check this: try ping, tracert, etc.

Please refer to your distro manual/documentation for more information about the configuration and setup.
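
Besides ping, it can help to test the actual TCP port, since a host can answer ping while the ES port is blocked or bound to another interface. Below is a minimal Python sketch, assuming Python 3 on the Hadoop node and that ES listens on its default HTTP port 9200 (es-hadoop talks to ES over REST). The host list is a placeholder and `can_reach` is just an illustrative helper name.

```python
import socket


def can_reach(host, port=9200, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hostnames.
        return False


if __name__ == "__main__":
    # Placeholder list: put the hostnames or IPs of your ES nodes here.
    for host in ["localhost"]:
        state = "reachable" if can_reach(host) else "NOT reachable"
        print(f"{host}:9200 is {state}")
```

If this reports "NOT reachable" from a task node while curl works on the ES machine itself, the "Out of nodes and retries" error is most likely a network or binding issue rather than a Pig problem.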

Cheers,

On 5/12/14 3:42 PM, hanine haninne wrote:
> I got the same error, but I don't know what I have to change in my "/etc/hosts".
> Thank you for your help.

--
Costin



(system) #10