Bulk insert error?

All,

I'm getting a strange error when doing a number of large bulk inserts.
The trace below is from the elasticsearch log; via the REST interface I
only get the out-of-bounds error back. Any idea what might be causing
this? It happens on a create operation, and there is definitely a lot of
data going in.

java.lang.ArrayIndexOutOfBoundsException
at org.apache.lucene.analysis.standard.StandardTokenizerImpl.zzRefill(StandardTokenizerImpl.java:442)
at org.apache.lucene.analysis.standard.StandardTokenizerImpl.getNextToken(StandardTokenizerImpl.java:649)
at org.apache.lucene.analysis.standard.StandardTokenizer.incrementToken(StandardTokenizer.java:167)
at org.apache.lucene.analysis.standard.StandardFilter.incrementToken(StandardFilter.java:50)
at org.apache.lucene.analysis.LowerCaseFilter.incrementToken(LowerCaseFilter.java:37)
at org.apache.lucene.analysis.StopFilter.incrementToken(StopFilter.java:141)
at org.elasticsearch.common.lucene.all.AllTokenStream.incrementToken(AllTokenStream.java:56)
at org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:188)
at org.apache.lucene.index.DocFieldProcessorPerThread.processDocument(DocFieldProcessorPerThread.java:246)
at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:826)
at org.apache.lucene.index.DocumentsWriter.addDocument(DocumentsWriter.java:802)
at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1998)
at org.elasticsearch.index.engine.robin.RobinEngine.innerCreate(RobinEngine.java:305)
at org.elasticsearch.index.engine.robin.RobinEngine.create(RobinEngine.java:229)
at org.elasticsearch.index.shard.service.InternalIndexShard.create(InternalIndexShard.java:264)
at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:139)
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:418)
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.access$100(TransportShardReplicationOperationAction.java:233)
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:331)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:636)
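
Since the REST interface only hands back the bare error, one way to narrow this down is to scan the bulk response for the items that failed and note their position and id. The following is a rough sketch, not anything from the thread: it assumes the bulk response is a JSON object with an "items" list in which each entry is keyed by its operation ("create" here) and carries an "error" string on failure. The helper name failed_items and the sample response are made up for illustration.

```python
import json


def failed_items(bulk_response):
    """Collect (position, operation, id, error) for every bulk item with an error.

    Assumes each entry in "items" looks like {"create": {..., "error": "..."}}
    when the operation failed; entries without an "error" key are skipped.
    """
    failures = []
    for position, item in enumerate(bulk_response.get("items", [])):
        for operation, result in item.items():
            if "error" in result:
                failures.append(
                    (position, operation, result.get("_id"), result["error"])
                )
    return failures


# Hypothetical response body, invented purely to exercise the helper.
sample = json.loads("""
{"items": [
  {"create": {"_index": "docs", "_type": "doc", "_id": "1", "ok": true}},
  {"create": {"_index": "docs", "_type": "doc", "_id": "2",
              "error": "ArrayIndexOutOfBoundsException"}}
]}
""")

for position, operation, doc_id, error in failed_items(sample):
    print(position, operation, doc_id, error)
```

Knowing which document in the batch fails makes it possible to resend just that document on its own and see whether the exception depends on its content or only shows up under bulk load.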

Which version are you using? I fixed a similar problem (related to the _all field) in 0.15.2: https://github.com/elasticsearch/elasticsearch/issues/closed#issue/743.
(In reply to Matt Paul's message of Friday, March 11, 2011 at 12:33 AM.)
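
For anyone wanting to answer the version question programmatically, here is a minimal sketch. It assumes the node's root REST endpoint returns JSON with a version.number field and that the node listens on http://localhost:9200; both are assumptions to check against your own setup. The helper names (parse_version, has_all_field_fix, node_version) are invented for this example.

```python
import json
import re
from urllib.request import urlopen


def parse_version(number):
    """Turn '0.15.2' or '0.16.0-SNAPSHOT' into a tuple of ints, ignoring any suffix."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", number)
    if match is None:
        raise ValueError("unrecognised version string: %r" % number)
    return tuple(int(part) for part in match.groups())


def has_all_field_fix(number, fixed_in=(0, 15, 2)):
    """True if the version number is at or above the release named in the reply."""
    return parse_version(number) >= fixed_in


def node_version(base_url="http://localhost:9200"):
    """Fetch the version string from a running node (URL and JSON shape assumed)."""
    with urlopen(base_url) as response:
        info = json.load(response)
    return info["version"]["number"]


print(has_all_field_fix("0.15.1"))  # False
print(has_all_field_fix("0.15.2"))  # True
```

Against a live node this would be called as has_all_field_fix(node_version()). Note that the comparison only looks at the number: a snapshot built from source may report a higher number than 0.15.2 and still predate a given fix, so for source builds the checkout date matters more than the version string.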

Shay,

I've been using a build from source that I downloaded on 3/1 (it was
the first version that had the fix for my has_child facet issue, #730).
I'll download the latest, give that a try, and let you know. Thanks!

Matt

(In reply to Shay Banon's message of Mar 11, 7:09 am.)

Shay,

I downloaded the latest snapshot and that did indeed fix my issue,
thanks!

Matt

(Follow-up to Matt Paul's message of Mar 11, 8:32 am.)