Unable to index using elasticsearch-hadoop plugin


(Chetana) #1

I am using the elasticsearch-hadoop plugin (
https://github.com/elasticsearch/elasticsearch-hadoop) and trying to index
some documents. I am using Elasticsearch version 0.90.2 and Hadoop
Hortonworks 2.2.0. The search functionality works fine, but the application
hangs while indexing.

The JSON file location is passed as a command-line argument, and the
indexing code snippet is below:

Configuration conf = new Configuration();
conf.setBoolean("mapred.map.tasks.speculative.execution", false);
conf.setBoolean("mapred.reduce.tasks.speculative.execution", false);
conf.setInt("mapred.min.split.size",40);
conf.set("es.resource", "test/test");
conf.set("es.nodes", "localhost");
conf.set("es.port", "9200");
conf.set("es.input.json", "yes");
conf.set("es.nodes", "localhost");
conf.set("es.port", "9200");

Job job = Job.getInstance(conf);
job.setMapperClass(Mapper.class);
job.setInputFormatClass(TextInputFormat.class);
job.setOutputFormatClass(EsOutputFormat.class);
job.setMapOutputKeyClass(LongWritable.class);
job.setMapOutputValueClass(Text.class);

Path jarPath = new Path(args[0]);
FileSystem fs = FileSystem.get(conf);
Path dst = new Path(fs.getHomeDirectory(), jarPath.getName());
fs.copyFromLocalFile(false, true, jarPath, dst);
FileInputFormat.setInputPaths(job, dst);

job.waitForCompletion(true);

Am I missing anything? Please help.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/15b0d1e9-e258-4fd4-b7bc-d7b81596fc81%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Costin Leau) #2

Hi,

If I understand correctly, you can read data from ES (through es-hadoop) but you cannot write to it - am I correct? Can
you confirm that you are using the latest es-hadoop, namely 1.3.0.M3?
How big is the JSON file you are trying to index? Do you see any activity in the console?

There are various ways in which you can monitor activity - in ES you can monitor the console or use Marvel [1], in
ES-hadoop you can enable logging [2] and see how the job progresses. Try starting with a small file to have a short
feedback loop and once things get ironed out, try your actual desired file.

A few notes:

  • In general we recommend using the latest stable version of Elasticsearch - 0.90.2 is quite old and unless you have a
    strong reason to stay on it, I highly recommend upgrading to 1.1.1 or, in the worst case scenario, ES 0.90.13.
  • you have repeating code - you set "es.nodes" and "es.port" twice
  • you are copying the local file to the destination filesystem (presumably HDFS) which is okay but typically this is
    done outside the job launch
  • if you are using Hadoop MRv2, consider switching to MRv1. es-hadoop supports both modes but MRv1 is easier to
    use and still the one the vendors recommend by default
  • since you are using Hadoop 2 from Hortonworks, you might want to upgrade to their latest HDP (2.1) release.

Hope this helps,

[1] http://www.elasticsearch.org/overview/marvel/
[2] http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/current/logging.html
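
As a minimal sketch of the note about the repeated settings (this is not code from the thread), the es-hadoop options can be collected once in a map and then copied into the Hadoop Configuration with conf.set(key, value). The class and method names (EsJobSettings, esSettings) are hypothetical; only the four es.* keys and their values are taken from the original post.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: gathers the es-hadoop settings from the original post,
// each one set exactly once.
class EsJobSettings {

    // Returns the settings in insertion order. In the real job each entry
    // would be applied with conf.set(entry.getKey(), entry.getValue()).
    static Map<String, String> esSettings(String nodes, int port, String resource) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("es.nodes", nodes);
        settings.put("es.port", Integer.toString(port));
        settings.put("es.resource", resource);
        // The input lines are already JSON documents, as in the original post.
        settings.put("es.input.json", "yes");
        return settings;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> e : esSettings("localhost", 9200, "test/test").entrySet()) {
            System.out.println(e.getKey() + "=" + e.getValue());
        }
    }
}
```

Keeping the keys in one place makes it harder to set the same option twice with diverging values.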

--
Costin


(Chetana) #3

Yes, I am able to search but not able to index. I am using 1.3.0.M2. The
JSON file is just 28 bytes.
I corrected the port/host and some log settings. With all these changes I am able to
run the application, but the indexing is still not happening.
I don't see any error messages in the log files.

I am using a single-node cluster and running both Elasticsearch and Hadoop
on the same system.


(Costin Leau) #4

Try es-hadoop 1.3.0.M3 - if you enable logging as indicated in the docs I've mentioned, you should see all the
activity - namely the connection being made, the data being transmitted, the reply, etc.

--
Costin


(Chetana) #5

I am now using ES 1.1.1 and, as before, es-hadoop 1.3.0.M3. But even with
the latest ES I am unable to index.
I don't see any log output pertaining to es-hadoop in any of the log files,
nor on the console. Also, there is no exception while running the job, and the
job completes successfully.

But if the Elasticsearch server is not running, the job throws an exception
and does not run:

/hadoop-yarn/staging/user1/.staging/job_1398663730568_0001
14/04/28 11:18:27 ERROR security.UserGroupInformation: PriviledgedActionException as:user1 (auth:SIMPLE) cause:java.io.IOException: Out of nodes and retries; caught exception
Exception in thread "main" java.io.IOException: Out of nodes and retries; caught exception
    at org.elasticsearch.hadoop.rest.NetworkClient.execute(NetworkClient.java:81)
    at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:221)
    at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:205)
    at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:209)
    at org.elasticsearch.hadoop.rest.RestClient.get(RestClient.java:103)

Could someone please suggest how to troubleshoot this issue? After this exercise
I need to index user log content into ES.

Thanks,
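
The trace above is thrown in the main thread when Elasticsearch cannot be contacted. As a rough sketch only, a driver could check that the port is open before submitting the job, so the failure is reported in plain words. The class and method names (EsReachability, isReachable) and the timeout values are assumptions for illustration; localhost and 9200 are the values from the job configuration. This tests only that a TCP connection can be opened, not that Elasticsearch is healthy.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical pre-flight check, separate from es-hadoop itself.
class EsReachability {

    // Returns true if a TCP connection to host:port can be opened within timeoutMillis.
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timeout, or unknown host all end up here.
            return false;
        }
    }

    public static void main(String[] args) {
        if (!isReachable("localhost", 9200, 2000)) {
            System.err.println("Elasticsearch is not reachable on localhost:9200; not submitting the job.");
            System.exit(1);
        }
        System.out.println("Port 9200 is open; the job can be submitted.");
    }
}
```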


(Costin Leau) #6

If ES is not running, getting an exception is expected, since the job will hit a connectivity error.
As for logging, make sure you properly configure log4j for your Hadoop environment - it depends on what version you are
using and what libraries.
If the job is complete, you can always test the results by querying ES for the data that was just indexed.
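
One way to run that check from Java, as a sketch under stated assumptions: ask the count endpoint of the target resource how many documents it holds. The class and method names (EsCountCheck, countUrl, fetch) and the timeouts are hypothetical; localhost, 9200 and test/test are the values from the job configuration. Newly indexed documents become visible to search only after a refresh (one second by default), so a count taken immediately after the job may lag slightly.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

// Hypothetical helper for checking whether the job actually wrote documents.
class EsCountCheck {

    // Builds the count URL for an "index/type" resource such as "test/test".
    static String countUrl(String host, int port, String resource) {
        return "http://" + host + ":" + port + "/" + resource + "/_count";
    }

    // Performs a GET and returns the raw JSON body; its "count" field holds the document count.
    static String fetch(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setConnectTimeout(2000);
        conn.setReadTimeout(5000);
        try (InputStream in = conn.getInputStream();
             Scanner scanner = new Scanner(in, StandardCharsets.UTF_8.name())) {
            // "\\A" matches the start of input, so the whole body is read as one token.
            scanner.useDelimiter("\\A");
            return scanner.hasNext() ? scanner.next() : "";
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(fetch(countUrl("localhost", 9200, "test/test")));
    }
}
```

A count of 0 after a job that reports success suggests that no records reached the output format, which narrows the search to the job setup rather than to Elasticsearch.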

--
Costin



(Chetana) #7

I verified the index by querying it, and the documents are not there. That is why I said I am unable to index,
even though there are no errors.

On Monday, April 28, 2014 12:35:12 PM UTC+5:30, Costin Leau wrote:

If ES is not running, getting an exception is expected since one will get a connectivity error.
As for logging, make sure you properly configure log4j for your Hadoop environment - it depends on what version you are
using and what libraries.
If the job is complete, you can always test the results by querying ES for the data that was just indexed.
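
A minimal sketch of such a check, assuming the `test/test` resource and the `localhost:9200` node from the original snippet. The class and method names (`EsCountCheck`, `parseCount`, `fetchCount`) are made up for illustration, and the regex-based parsing is a shortcut rather than a proper JSON parser. It asks Elasticsearch for the document count through the `_count` endpoint:

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class EsCountCheck {

    // Matches the "count" field of a _count response such as {"count":988,"_shards":{...}}.
    private static final Pattern COUNT = Pattern.compile("\"count\"\\s*:\\s*(\\d+)");

    /** Extracts the document count from a _count response body, or -1 if it is absent. */
    public static long parseCount(String json) {
        Matcher m = COUNT.matcher(json);
        return m.find() ? Long.parseLong(m.group(1)) : -1L;
    }

    /** Issues GET baseUrl/resource/_count and returns the raw response body. */
    public static String fetchCount(String baseUrl, String resource) throws IOException {
        URL url = new URL(baseUrl + "/" + resource + "/_count");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        try (InputStream in = conn.getInputStream();
             Scanner s = new Scanner(in, StandardCharsets.UTF_8.name()).useDelimiter("\\A")) {
            // "\\A" makes the scanner return the whole stream as a single token.
            return s.hasNext() ? s.next() : "";
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Host, port and resource are assumptions taken from the job configuration above.
        String body = fetchCount("http://localhost:9200", "test/test");
        System.out.println("Documents in test/test: " + parseCount(body));
    }
}
```

A count of 0 (or an error response because the index does not exist) right after a job that reported success would suggest the job never wrote anything, rather than a search-side problem.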

On 4/28/14 9:20 AM, Chetana wrote:

I am now using ES 1.1.1 and, as before, es-hadoop 1.3.0.M3. But even with the latest ES I am unable to index.
I don't see any log output pertaining to es-hadoop in any of the log files or on the console. There is also no exception
while running the job, and the job completes successfully.
But if the Elasticsearch server is not running, the job throws an exception and does not run:

/hadoop-yarn/staging/user1/.staging/job_1398663730568_0001
14/04/28 11:18:27 ERROR security.UserGroupInformation: PriviledgedActionException as:user1 (auth:SIMPLE) cause:java.io.IOException: Out of nodes and retries; caught exception
Exception in thread "main" java.io.IOException: Out of nodes and retries; caught exception
    at org.elasticsearch.hadoop.rest.NetworkClient.execute(NetworkClient.java:81)
    at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:221)
    at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:205)
    at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:209)
    at org.elasticsearch.hadoop.rest.RestClient.get(RestClient.java:103)

Could someone please suggest how to troubleshoot this issue? After this exercise I need to index user log content into ES.
Thanks,



(Costin Leau) #8

Sorry, I'm not following.
Who said "unable to index"? By querying I meant reading data from Elasticsearch, not writing (aka indexing).

Even if you don't configure logging [1], if you launch your job with waitForCompletion(true) you'll get the es-hadoop
statistics at the end of it:

Elasticsearch Hadoop Counters
Scroll Reads=0
Bytes Accepted=208632
Bytes Retried=0
Bulk Writes=26
Documents Read=0
Documents Written=988
Bytes Read=90728
Documents Retried=0
Bulk Total Time(ms)=298
Bytes Written=208632
Bulk Retries=0
Documents Accepted=988
Network Retries=0
Network Total Time(ms)=451
Scroll Total Time(ms)=0
Bulk Retries Total Time(ms)=0
Node Retries=0

If none of this information appears in your console then you likely have some issues with your job configuration.

[1] http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/current/logging.html
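
As a rough illustration of how to read those statistics, the sketch below parses `Name=value` counter lines like the ones above and checks whether `Documents Written` is greater than zero. It assumes the console output of the job has been captured as a string. `EsCounterCheck`, `parseCounters` and `wroteDocuments` are hypothetical helper names, not part of es-hadoop or Hadoop.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class EsCounterCheck {

    /** Parses "Name=value" lines into a map; lines without a numeric value are skipped. */
    public static Map<String, Long> parseCounters(String consoleOutput) {
        Map<String, Long> counters = new LinkedHashMap<>();
        for (String line : consoleOutput.split("\\r?\\n")) {
            int eq = line.lastIndexOf('=');
            if (eq <= 0) {
                continue; // group header or unrelated log line
            }
            String name = line.substring(0, eq).trim();
            String value = line.substring(eq + 1).trim();
            try {
                counters.put(name, Long.parseLong(value));
            } catch (NumberFormatException ignored) {
                // Not a numeric counter line; skip it.
            }
        }
        return counters;
    }

    /** True when the job reported at least one document as written. */
    public static boolean wroteDocuments(Map<String, Long> counters) {
        return counters.getOrDefault("Documents Written", 0L) > 0;
    }

    public static void main(String[] args) {
        // Sample lines copied from the statistics shown above.
        String sample = String.join("\n",
                "Elasticsearch Hadoop Counters",
                "Bulk Writes=26",
                "Documents Written=988",
                "Bulk Total Time(ms)=298");
        Map<String, Long> counters = parseCounters(sample);
        System.out.println("Documents Written = " + counters.get("Documents Written"));
        System.out.println("Indexed anything? " + wroteDocuments(counters));
    }
}
```

If the counter block is missing from the output altogether, `parseCounters` returns no `Documents Written` entry and `wroteDocuments` reports false, which matches the "job configuration" case described above.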



(system) #9