FlushNotAllowedEngineException exception during optimize

(Dariusp) #1

I am running Elasticsearch 1.5.2. My indexing strategy is to create a new index, push all the data, and then apply settings to the index (for example number_of_replicas and an alias).
Sometimes I get a strange exception in the server logs during the optimize call:

[action.admin.indices.optimize] [node] [my_index][2], node[DDoIG8BySM6AIWrmiZDwyw], [P], s[STARTED]: failed to execute [OptimizeRequest{maxNumSegments=5, onlyExpungeDeletes=false, flush=true, upgrade=false}] 
org.elasticsearch.index.engine.OptimizeFailedEngineException: [my_index][2] force merge failed 
        at org.elasticsearch.index.engine.InternalEngine.forceMerge(InternalEngine.java:791) 
        at org.elasticsearch.index.shard.IndexShard.optimize(IndexShard.java:684) 
        at org.elasticsearch.action.admin.indices.optimize.TransportOptimizeAction.shardOperation(TransportOptimizeAction.java:110) 
        at org.elasticsearch.action.admin.indices.optimize.TransportOptimizeAction.shardOperation(TransportOptimizeAction.java:49) 
        at org.elasticsearch.action.support.broadcast.TransportBroadcastOperationAction$AsyncBroadcastAction$1.run(TransportBroadcastOperationAction.java:171) 
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
        at java.lang.Thread.run(Thread.java:745) 
Caused by: org.elasticsearch.index.engine.FlushNotAllowedEngineException: [my_index][2] recovery is in progress, flush with committing translog is not allowed 
        at org.elasticsearch.index.engine.InternalEngine.flush(InternalEngine.java:601) 
        at org.elasticsearch.index.engine.InternalEngine.forceMerge(InternalEngine.java:782)

The code that reproduces this is:

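// create the index with settings tuned for bulk indexing
Settings indexSettings = ImmutableSettings.settingsBuilder()
        .put("number_of_shards", 5)
        .put("number_of_replicas", 0)
        .put("refresh_interval", "-1")
        .put("merge.policy.merge_factor", "50")
        .put("index.merge.scheduler.max_thread_count", "1")
        .build();
// addMapping belongs on the create-index request, not on the settings builder
// (client is assumed to be the TransportClient)
client.admin().indices().prepareCreate(esIndexName)
        .setSettings(indexSettings)
        .addMapping(mappingName, getFieldMapping())
        .get();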

// push data to the Elasticsearch server
bulkProcessor.add(new IndexRequest(esIndexName, esMappingName, id).source(source)); // called ~10 million times

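// after indexing: apply the final index settings
// (client is assumed to be the TransportClient)
Settings updatedSettings = ImmutableSettings.settingsBuilder()
        .put("number_of_replicas", 2)
        .put("refresh_interval", "5s")
        .build();
client.admin().indices().prepareUpdateSettings(esIndexName).setSettings(updatedSettings).get();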

My guess is that the problem is that I set number_of_replicas and then call optimize while Elasticsearch is still replicating the data. Could that be the problem? If so, is there a way to know when the index is ready for optimize?

(Jason Wee) #2

Why do you do this programmatically rather than going through the API exposed by Elasticsearch, especially for the optimize command?

(Dariusp) #3

I use TransportClient because it performs faster than the REST API. Are there any differences between calling optimize through the REST API and through TransportClient?

(Mike Simos) #4

I suggest you call optimize before adding the replicas. Otherwise it has to optimize the replicas too, and it's probably failing because the replicas haven't finished being created yet (the exception says recovery is in progress). If you call optimize before adding replicas, it will merge each shard down to at most 5 segments, and when you add the replicas, the replica shards should end up with the same number of segments. This procedure is documented here:

"Use 0 replicas while building up your initial large index, and then enable replicas later on and let them catch up. Just beware that a node failure when you have 0 replicas means you have lost data (your cluster is red) since there is no redundancy. If you plan to call optimize (because no more documents will be added), it is a good idea to do that after finishing indexing and before increasing the replica count so replication can just copy the optimized segment(s). See update index settings for details."
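
The order described above can be sketched roughly as follows. This is only an illustration of the sequence, not Elasticsearch client code: `IndexOps` and its method names are hypothetical stand-ins for whichever client calls you use (for example the TransportClient's optimize, update-settings, and cluster-health requests), and `RecordingOps` is a fake that just records the calls. The final wait-for-green step is an assumption about how you might "let them catch up".

```java
import java.util.ArrayList;
import java.util.List;

public class BulkLoadOrder {

    /** Hypothetical abstraction over the cluster calls; not an Elasticsearch type. */
    interface IndexOps {
        void optimize(String index, int maxNumSegments);
        void setReplicas(String index, int replicas);
        void waitForGreen(String index);
    }

    /**
     * Runs the post-indexing steps in the recommended order:
     * optimize while there are still 0 replicas, then add replicas,
     * then wait until the replicas have been allocated.
     */
    static void finishBulkLoad(IndexOps ops, String index, int maxNumSegments, int replicas) {
        ops.optimize(index, maxNumSegments);   // merge primaries first
        ops.setReplicas(index, replicas);      // replication copies the merged segments
        ops.waitForGreen(index);               // assumed step: block until replicas are ready
    }

    /** Fake implementation that only records which calls were made, in order. */
    static class RecordingOps implements IndexOps {
        final List<String> calls = new ArrayList<>();

        @Override
        public void optimize(String index, int maxNumSegments) {
            calls.add("optimize " + index + " " + maxNumSegments);
        }

        @Override
        public void setReplicas(String index, int replicas) {
            calls.add("setReplicas " + index + " " + replicas);
        }

        @Override
        public void waitForGreen(String index) {
            calls.add("waitForGreen " + index);
        }
    }

    public static void main(String[] args) {
        RecordingOps ops = new RecordingOps();
        finishBulkLoad(ops, "my_index", 5, 2);
        System.out.println(ops.calls);
    }
}
```

In a real setup, each `IndexOps` method would wrap the corresponding request against the cluster. The point is only that optimize runs before the replica count is raised, so it never overlaps with replica recovery.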

(Dariusp) #5

Thanks Mike, that solved my problem.