ByteBufferFile has an NPE


(James Cook) #1

I posted this problem on IRC earlier without any resolution, in case you
sense deja vu.

Using ES 0.13.0 on Mac OS X.

I have a bootstrap process which inserts documents into a single index
(several different types) as my application initializes. When I set the
index.type to niofs, everything works as expected.

However, when I use index.type = memory, things fail with the following
exception:

13:24:37,458 DEBUG ker #2-1 org.elasticsearch.action.get: 74 - [Tempest] Index Shard [nep][3]: Failed to get [activityStreams#537f083e2d2a4bb4a83400dc4890a271]
org.elasticsearch.transport.RemoteTransportException: [Harpy][inet[/fe80:0:0:0:223:12ff:fe1e:c043%5:9310]][indices/get/shard]
Caused by: java.lang.NullPointerException
at org.elasticsearch.index.store.memory.ByteBufferFile.numberOfBuffers(ByteBufferFile.java:62)
at org.elasticsearch.index.store.memory.ByteBufferIndexInput.switchCurrentBuffer(ByteBufferIndexInput.java:92)
at org.elasticsearch.index.store.memory.ByteBufferIndexInput.seek(ByteBufferIndexInput.java:82)
at org.apache.lucene.index.SegmentTermEnum.seek(SegmentTermEnum.java:114)
at org.apache.lucene.index.TermInfosReader.seekEnum(TermInfosReader.java:172)
at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:231)
at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:179)
at org.apache.lucene.index.SegmentTermDocs.seek(SegmentTermDocs.java:57)
at org.apache.lucene.index.DirectoryReader$MultiTermDocs.termDocs(DirectoryReader.java:1216)
at org.apache.lucene.index.DirectoryReader$MultiTermDocs.next(DirectoryReader.java:1148)
at org.elasticsearch.common.lucene.Lucene.docId(Lucene.java:65)
at org.elasticsearch.action.get.TransportGetAction.shardOperation(TransportGetAction.java:90)
at org.elasticsearch.action.get.TransportGetAction.shardOperation(TransportGetAction.java:53)
at org.elasticsearch.action.support.single.TransportSingleOperationAction$ShardTransportHandler.messageReceived(TransportSingleOperationAction.java:268)
at org.elasticsearch.action.support.single.TransportSingleOperationAction$ShardTransportHandler.messageReceived(TransportSingleOperationAction.java:261)
at org.elasticsearch.transport.netty.MessageChannelHandler$3.run(MessageChannelHandler.java:195)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:680)

Tracing through the code, I can see where ByteBufferFile.numberOfBuffers is
called, but this.buffers is null! This results in an NPE when the following
method is called.

public class ByteBufferFile {
    ....
    private volatile ByteBuffer[] buffers;
    ....
    int numberOfBuffers() {
        return this.buffers.length;
    }
    ....
}
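
To make the failure mode concrete, here is a minimal sketch of a class with the same shape as the excerpt above. It is not the Elasticsearch source: SketchFile, setBuffers and numberOfBuffersOrZero are hypothetical names, and the null-tolerant variant is only one possible guard, not the project's fix.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in for the excerpt above; not the actual Elasticsearch class.
class NullBuffersSketch {

    static class SketchFile {
        // Never assigned at construction, so it holds Java's default value: null.
        private volatile ByteBuffer[] buffers;

        // Same shape as the quoted method: dereferences the field directly.
        int numberOfBuffers() {
            return this.buffers.length;
        }

        // Null-tolerant variant (an assumption, not the project's fix):
        // read the volatile field once, then treat "not yet set" as zero buffers.
        int numberOfBuffersOrZero() {
            ByteBuffer[] local = this.buffers;
            return local == null ? 0 : local.length;
        }

        void setBuffers(ByteBuffer[] buffers) {
            this.buffers = buffers;
        }
    }

    // Returns true if calling numberOfBuffers() on the given file throws an NPE.
    static boolean throwsNpe(SketchFile file) {
        try {
            file.numberOfBuffers();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        SketchFile file = new SketchFile();
        System.out.println("NPE before buffers are set: " + throwsNpe(file));
        System.out.println("null-tolerant count: " + file.numberOfBuffersOrZero());
        file.setBuffers(new ByteBuffer[] { ByteBuffer.allocate(16) });
        System.out.println("count after set: " + file.numberOfBuffers());
    }
}
```

Copying the volatile field into a local before the null check matters: checking the field and then reading it again could observe two different values if another thread writes in between.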

I wish I could create a curl gist to reproduce, but this happens deep in a
large initialization routine.


(Shay Banon) #2

Heya,

Can you try with 0.13.1? I made an assumption (Lucene-wise) in the implementation of the Directory that was broken in 0.13.0 (older Lucene version). The assumption is that while a "writer" to a file is open, there is no "reader" of that file. It might not hold forever, since future Lucene development might allow this to happen, so I will remove the assumption.

Is this something you can reproduce easily on your end (just to see whether it's easy to verify a fix)?
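
The writer/reader assumption can be modelled in a few lines. This is only a guess at the general shape of the problem, not the actual Directory implementation: MemFile, Writer and Reader are hypothetical names, and the sketch assumes the buffers become visible to readers only when the writer closes.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the "no reader while a writer is open" assumption.
// None of these names come from Elasticsearch or Lucene.
class WriterReaderSketch {

    static class MemFile {
        // Published only when the writer closes, mirroring the assumption.
        volatile ByteBuffer[] buffers;
    }

    static class Writer {
        private final MemFile file;
        private final List<ByteBuffer> pending = new ArrayList<>();

        Writer(MemFile file) {
            this.file = file;
        }

        void write(byte[] data) {
            pending.add(ByteBuffer.wrap(data.clone()));
        }

        // The file's buffers become visible to readers only here.
        void close() {
            file.buffers = pending.toArray(new ByteBuffer[0]);
        }
    }

    static class Reader {
        private final MemFile file;

        Reader(MemFile file) {
            this.file = file;
        }

        // Fails with an NPE if the writer has not closed yet.
        int bufferCount() {
            return file.buffers.length;
        }
    }

    // Opens a reader while the writer is still open and reports whether it failed.
    static boolean readerFailsBeforeClose() {
        MemFile file = new MemFile();
        Writer writer = new Writer(file);
        writer.write(new byte[] {1, 2, 3});
        try {
            new Reader(file).bufferCount();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // Closes the writer first, then reads; returns the visible buffer count.
    static int readerCountAfterClose() {
        MemFile file = new MemFile();
        Writer writer = new Writer(file);
        writer.write(new byte[] {1, 2, 3});
        writer.close();
        return new Reader(file).bufferCount();
    }

    public static void main(String[] args) {
        System.out.println("reader fails before close: " + readerFailsBeforeClose());
        System.out.println("buffers after close: " + readerCountAfterClose());
    }
}
```

In this model the read succeeds as long as every reader is opened after the writer closes, and fails as soon as that ordering is violated.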

-shay.banon


(James Cook) #3

Hi Shay,

The same exception occurs in 0.13.1.

The only difference I noticed is that I now have a new exception.
I am not sure whether the two are related.

14:32:39,061 ERROR thread-1 rchListenerContainer: 95 - Error while loading key: e8fbb3dfe2b54371b8dcbfd5d80aad87
org.elasticsearch.transport.RemoteTransportException: [Lord Dark Wind][inet[/fe80:0:0:0:223:12ff:fe1e:c043%5:9310]][indices/get/shard]
Caused by: java.lang.IndexOutOfBoundsException: Index: 100, Size: 39
at java.util.ArrayList.RangeCheck(ArrayList.java:547)
at java.util.ArrayList.get(ArrayList.java:322)
at org.apache.lucene.index.FieldInfos.fieldInfo(FieldInfos.java:285)
at org.apache.lucene.index.FieldInfos.fieldName(FieldInfos.java:274)
at org.apache.lucene.index.TermBuffer.read(TermBuffer.java:86)
at org.apache.lucene.index.SegmentTermEnum.next(SegmentTermEnum.java:131)
at org.apache.lucene.index.SegmentTermEnum.scanTo(SegmentTermEnum.java:162)
at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:232)
at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:179)
at org.apache.lucene.index.SegmentTermDocs.seek(SegmentTermDocs.java:57)
at org.apache.lucene.index.DirectoryReader$MultiTermDocs.termDocs(DirectoryReader.java:1224)
at org.apache.lucene.index.DirectoryReader$MultiTermDocs.next(DirectoryReader.java:1156)
at org.elasticsearch.common.lucene.Lucene.docId(Lucene.java:65)
at org.elasticsearch.action.get.TransportGetAction.shardOperation(TransportGetAction.java:90)
at org.elasticsearch.action.get.TransportGetAction.shardOperation(TransportGetAction.java:53)
at org.elasticsearch.action.support.single.TransportSingleOperationAction$ShardTransportHandler.messageReceived(TransportSingleOperationAction.java:268)
at org.elasticsearch.action.support.single.TransportSingleOperationAction$ShardTransportHandler.messageReceived(TransportSingleOperationAction.java:261)
at org.elasticsearch.transport.netty.MessageChannelHandler$3.run(MessageChannelHandler.java:195)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:680)

-- jim



(Shay Banon) #4

Heya,

I tried to recreate this and still have not managed to. Is there a chance
you could work on a recreation (as simple as possible)?

-shay.banon



(Robert Muir) #5

Shay, is this the same directory as
https://issues.apache.org/jira/browse/LUCENE-2292 ?

If you apply the patch there, give it a no-arg constructor, and run all the
Lucene tests with this directory (ant test
-Dtests.directory=ByteBufferDirectory), it seems to expose the NPE
issue in a couple of tests.


(Shay Banon) #6

Hi Robert,

Yes, this is the issue that forms the basis for this Directory
implementation. I ran the tests and saw that some were indeed failing. I
pushed a new patch to LUCENE-2292, with fixes and enhancements (a new
pluggable allocator), and integrated the code into ES at:
https://github.com/elasticsearch/elasticsearch/issues/issue/577.

Hopefully this is fixed now.

-shay.banon

