HDFS Gateway - SendRequestTransportException


(Zaharije) #1

Hi

I'm writing a small multi-threaded app to index some random data (each
thread has its own client). With the default gateway it works fine: I
indexed 10M documents in about 90 minutes.

The ES cluster is a small one: 8 nodes, each with 4 cores and 4GB of memory.

After I installed the HDFS gateway, I started seeing strange behavior: sometimes
I get the following exception on the client side (I'm using TransportClient):

Exception in thread "Thread-7" org.elasticsearch.transport.SendRequestTransportException: [Wonder Man][inet[/172.17.12.1:9300]][indices/index/shard/index]
    at org.elasticsearch.transport.RemoteTransportException.fillStack(RemoteTransportException.java:58)
    at org.elasticsearch.transport.SendRequestTransportException.fillInStackTrace(SendRequestTransportException.java:34)
    at java.lang.Throwable.<init>(Throwable.java:218)
    at java.lang.Exception.<init>(Exception.java:59)
    at java.lang.RuntimeException.<init>(RuntimeException.java:61)
    at org.elasticsearch.ElasticSearchException.<init>(ElasticSearchException.java:46)
    at org.elasticsearch.transport.TransportException.<init>(TransportException.java:34)
    at org.elasticsearch.transport.RemoteTransportException.<init>(RemoteTransportException.java:39)
    at org.elasticsearch.transport.SendRequestTransportException.<init>(SendRequestTransportException.java:30)
    at org.elasticsearch.transport.TransportService$2.run(TransportService.java:197)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:637)
Caused by: java.lang.NullPointerException
    at org.elasticsearch.util.io.stream.BytesStreamOutput$Cached.cachedHandles(BytesStreamOutput.java:59)
    at org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:385)
    at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:183)
    at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:171)
    at org.elasticsearch.client.transport.action.support.BaseClientTransportAction.execute(BaseClientTransportAction.java:70)
    at org.elasticsearch.client.transport.action.support.BaseClientTransportAction.execute(BaseClientTransportAction.java:65)
    at org.elasticsearch.client.transport.support.InternalTransportClient$1.doWithNode(InternalTransportClient.java:125)
    at org.elasticsearch.client.transport.support.InternalTransportClient$1.doWithNode(InternalTransportClient.java:123)
    at org.elasticsearch.client.transport.TransportClientNodesService.execute(TransportClientNodesService.java:144)
    at org.elasticsearch.client.transport.support.InternalTransportClient.index(InternalTransportClient.java:123)
    at org.elasticsearch.client.transport.TransportClient.index(TransportClient.java:223)
    at com.test.App$IndexThread.index(App.java:115)
    at com.test.App$IndexThread.run(App.java:34)

HDFS is also clustered across 13 nodes, and I do not see any errors in
the namenode.
Any ideas?

Regards
Zaharije


(Shay Banon) #2

Do you get this exception consistently? Strange one...

-shay.banon


(Zaharije) #3

Yes.

It seems that this is more of an HDFS-related problem. After a few runs I'm
getting timeout exceptions from HDFS, but when I lower my
thread/process count (for creating the index), it looks fine.
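
For readers who want to cap indexing concurrency without rewriting their worker threads, here is a minimal sketch using only java.util.concurrent. It is not the code from this thread: the names ThrottledIndexer, IndexAction and run are hypothetical, and IndexAction is a stand-in for whatever real indexing call the app makes. It assumes a Java 8+ compiler for the lambda syntax.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class ThrottledIndexer {

    /** Hypothetical stand-in for the real indexing call made by the app. */
    interface IndexAction {
        void index(int docId) throws Exception;
    }

    /**
     * Submits docCount index calls to a fixed pool of `threads` workers while
     * allowing at most maxInFlight calls to be queued or running at once.
     * Returns how many calls completed without throwing.
     */
    static int run(int docCount, int threads, int maxInFlight, IndexAction action)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        Semaphore permits = new Semaphore(maxInFlight);
        AtomicInteger succeeded = new AtomicInteger();
        try {
            for (int i = 0; i < docCount; i++) {
                final int docId = i;
                // Blocks the submitting thread once maxInFlight calls are outstanding.
                permits.acquire();
                pool.execute(() -> {
                    try {
                        action.index(docId);
                        succeeded.incrementAndGet();
                    } catch (Exception e) {
                        // Assumption: failures are only counted as misses here;
                        // a real app would log and possibly retry.
                    } finally {
                        permits.release();
                    }
                });
            }
        } finally {
            pool.shutdown();
        }
        // Wait for the already submitted calls to finish.
        pool.awaitTermination(1, TimeUnit.HOURS);
        return succeeded.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // No-op action used purely for demonstration.
        int ok = run(1000, 8, 4, docId -> { });
        System.out.println("indexed " + ok + " of 1000 documents");
    }
}
```

Lowering maxInFlight has the same effect as reducing the thread count, but it can be tuned without changing how many worker threads exist.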

In general, is it a good idea to store Lucene indexes in HDFS? I
thought that HDFS has really bad performance for reads and writes of
small files.

Regards
Zaharije


(Shay Banon) #4

The HDFS support is done in the gateway; the actual indices that
elasticsearch works with are stored on the local file system (by default). I
talk about the gateway's role in elasticsearch here:
http://www.elasticsearch.com/blog/2010/02/16/searchengine_time_machine.html.

-shay.banon


(Otis Gospodnetić) #5

Hi Zaharije,

For what it's worth, and just to state it explicitly: HDFS is great
for storing indices because it automatically creates their replicas,
so one doesn't have to make backups. But, yes, you shouldn't search
indices stored in HDFS.

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/


(Zaharije) #6

So if I understand correctly:

The gateway is used for long-term persistent storage, and I can control it
via gateway modules (like hadoop, fs, cloud). And the index gateway is
used at the per-node (is it in fact per-shard?) level.

So if I had enough memory, I could use, let's say, HDFS for long-term
storage and memory at the shard level?

thx
Zaharije


(Shay Banon) #7

The gateway is the gateway, and it's used for long-term storage. The index gateway is
simply the module responsible for persisting the index data to the gateway.
It's there mainly so you can configure, for example, a global file system
gateway, and disable it for a specific index.

Each index (as you correctly said, each shard) stores its state "locally";
this is the store module, and it can be either file system, memory, or a
combination.

-shay.banon


