Unknown error in TransportShardSingleOperationAction.java

Hello,

I have a program that sends bursts of bulk index requests to Elasticsearch
(1.1) using the Java API. I send 1000 documents (in one bulk request) every
2-5 seconds. Initially I was running into
NoNodeAvailableException/NoShardAvailableException (on the client side) and
OutOfMemoryError (on the server). To work around this, I now wait after every
few bulk requests before sending any more (a 1s wait after every 10 bulk
requests, and a 60s wait after every 30 bulk requests; I arrived at these
numbers by trial and error in my configuration). With this waiting, my program
ran much longer, but it has now hit an unknown error after inserting around
3GB (~80,000 documents) of data. The error is:

org.elasticsearch.action.NoShardAvailableActionException: [records][1] null
    at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.perform(TransportShardSingleOperationAction.java:145)
    at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.onFailure(TransportShardSingleOperationAction.java:132)
    at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.access$900(TransportShardSingleOperationAction.java:97)
    at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction$1.run(TransportShardSingleOperationAction.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:744)

The corresponding error on the server is:

org.elasticsearch.index.engine.EngineException: [records][1] this ReferenceManager is closed
    at org.elasticsearch.index.engine.internal.InternalEngine.acquireSearcher(InternalEngine.java:662)
    at org.elasticsearch.index.engine.internal.InternalEngine.loadCurrentVersionFromIndex(InternalEngine.java:1317)
    at org.elasticsearch.index.engine.internal.InternalEngine.innerIndex(InternalEngine.java:495)
    at org.elasticsearch.index.engine.internal.InternalEngine.index(InternalEngine.java:470)
    at org.elasticsearch.index.shard.service.InternalIndexShard.index(InternalIndexShard.java:396)
    at org.elasticsearch.action.bulk.TransportShardBulkAction.shardIndexOperation(TransportShardBulkAction.java:401)
    at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:157)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:556)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:426)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:744)
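
As an aside on the throttling: the manual "wait after N bulk requests" scheme is roughly what the Java client's BulkProcessor does automatically. It batches index requests and limits how many bulk requests are in flight at once, blocking add() when that limit is reached. This is an untested sketch, assuming the Elasticsearch 1.x Java client is on the classpath; the index/type names and thresholds are placeholders, not values from this post:

```java
import org.elasticsearch.action.bulk.BulkProcessor;
import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.common.unit.ByteSizeUnit;
import org.elasticsearch.common.unit.ByteSizeValue;

public class BulkIndexer {
    // Build a BulkProcessor that flushes every 1000 docs or 5 MB and keeps
    // at most one bulk request in flight (add() blocks beyond that).
    static BulkProcessor build(Client client) {
        return BulkProcessor.builder(client, new BulkProcessor.Listener() {
            @Override
            public void beforeBulk(long executionId, BulkRequest request) {
            }

            @Override
            public void afterBulk(long executionId, BulkRequest request,
                                  BulkResponse response) {
                if (response.hasFailures()) {
                    System.err.println(response.buildFailureMessage());
                }
            }

            @Override
            public void afterBulk(long executionId, BulkRequest request,
                                  Throwable failure) {
                // e.g. rejected executions when the server is overloaded
                failure.printStackTrace();
            }
        })
        .setBulkActions(1000)                               // flush every 1000 docs
        .setBulkSize(new ByteSizeValue(5, ByteSizeUnit.MB)) // or every 5 MB
        .setConcurrentRequests(1)                           // at most 1 bulk in flight
        .build();
    }
}

// Usage sketch: processor.add(new IndexRequest("records", "record").source(json));
// then processor.close() when finished, so the last partial batch is flushed.
```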

Additional information:

  1. I also run into occasional ReceiveTimeoutTransportException errors on GET
    requests (on the client), along with several GC memory warnings and some
    OutOfMemoryError occurrences (on the server). For example:

May 08, 2014 1:42:05 PM org.elasticsearch.client.transport
INFO: [Yuri Topolov] failed to get node info for [#transport#-1][.local][inet[localhost/127.0.0.1:9300]], disconnecting...

org.elasticsearch.transport.ReceiveTimeoutTransportException: [][inet[localhost/127.0.0.1:9300]][cluster/nodes/info] request_id [12345] timed out after [5001ms]
    at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:356)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:744)

  2. Before the bulk index requests, I fetch the current document content from
    Elasticsearch using individual GET requests (around 1000 requests/second,
    rather than a single multi-get request).

  3. Configuration: I am currently running both my Java program and
    Elasticsearch 1.1 on the same machine in a development environment with
    default configurations.
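
On the wait-time tuning specifically: rather than fixed sleeps found by trial and error, a common pattern is exponential backoff, i.e. retrying a failed or rejected request after an increasing delay and resetting the delay after a success. This is a self-contained sketch of just the delay schedule (the retry loop around the actual Elasticsearch call is omitted, and the base/cap values are illustrative, not recommendations):

```java
// Exponential backoff delay schedule: base * 2^attempt, capped at a maximum.
class Backoff {
    private final long baseMillis;
    private final long maxMillis;
    private int attempt = 0;

    Backoff(long baseMillis, long maxMillis) {
        this.baseMillis = baseMillis;
        this.maxMillis = maxMillis;
    }

    /** Delay to sleep before the next retry; doubles on every call. */
    long nextDelayMillis() {
        // Cap the shift so the multiplication cannot overflow.
        long delay = Math.min(maxMillis, baseMillis * (1L << Math.min(attempt, 20)));
        attempt++;
        return delay;
    }

    /** Call after a successful request so the next failure starts small again. */
    void reset() {
        attempt = 0;
    }

    public static void main(String[] args) {
        Backoff backoff = new Backoff(100, 60_000);
        for (int i = 0; i < 4; i++) {
            // prints 100, 200, 400, 800
            System.out.println("retry in " + backoff.nextDelayMillis() + " ms");
        }
    }
}
```

The idea is to sleep only when the cluster actually pushes back (rejections, timeouts), instead of sleeping unconditionally after every N bulk requests.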

So, a few questions:

  1. How can I get rid of this error?

  2. Is there any way to ask Elasticsearch whether it is ready to index more
    documents before I send further requests? That would let me drop the
    arbitrary wait times and make my code more robust across workloads. I have
    around 10x more data to process, for which I will deploy Elasticsearch on
    a dedicated cluster, and I would like to process it as fast as possible.

  3. Will using multi-get/scroll requests instead of thousands of individual
    GET requests help Elasticsearch perform better? (I currently run each GET
    request on a thread pool that fetches data from Elasticsearch, processes
    it, and updates a hashmap.)

  4. Is there any other way to improve the overall process, if possible?
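
On question 3, the Java API does have a multi-get, which should cut round-trips considerably compared to ~1000 individual GETs. This is an untested sketch against the 1.x Java client; the index/type names are taken from the traces above, but the method name and map handling are my own placeholders:

```java
import org.elasticsearch.action.get.MultiGetItemResponse;
import org.elasticsearch.action.get.MultiGetResponse;
import org.elasticsearch.client.Client;

import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CurrentDocs {
    // Fetch a batch of current documents in one round-trip instead of
    // issuing one GET per document.
    static Map<String, Map<String, Object>> fetchCurrent(Client client, List<String> ids) {
        MultiGetResponse responses = client.prepareMultiGet()
                .add("records", "record", ids)   // index/type as in the traces above
                .get();
        Map<String, Map<String, Object>> current = new ConcurrentHashMap<>();
        for (MultiGetItemResponse item : responses.getResponses()) {
            if (!item.isFailed() && item.getResponse().isExists()) {
                current.put(item.getId(), item.getResponse().getSourceAsMap());
            }
        }
        return current;
    }
}
```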

Thanks!

To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/4eee1128-e8fc-40fb-b6a1-be787dc87dc7%40googlegroups.com.