Getting "EngineClosedException" in ES server with new index

In indexing a bunch of documents, I'm deleting all indexes, recreating them, and indexing.

I'm not sure exactly what in the client code is causing this, but here's the exception(s) I'm getting (in ES 2.2.0) - any clues?

[2016-02-26 15:35:29,339][INFO ][node ] [Ivan Kragoff] version[2.2.0], pid[8632], build[8ff36d1/2016-01-27T13:32:39Z]
[2016-02-26 15:35:29,347][INFO ][node ] [Ivan Kragoff] initializing ...
[2016-02-26 15:35:29,910][INFO ][plugins ] [Ivan Kragoff] modules [lang-groovy, lang-expression], plugins [], sites []
[2016-02-26 15:35:29,948][INFO ][env ] [Ivan Kragoff] using [1] data paths, mounts [[Win7x64Pro (C:)]], net usable_space [52.1gb], net total_space [223.5gb], spins? [unknown], types [NTFS]
[2016-02-26 15:35:29,949][INFO ][env ] [Ivan Kragoff] heap size [910.5mb], compressed ordinary object pointers [true]
[2016-02-26 15:35:32,507][INFO ][node ] [Ivan Kragoff] initialized
[2016-02-26 15:35:32,508][INFO ][node ] [Ivan Kragoff] starting ...
[2016-02-26 15:35:32,845][INFO ][transport ] [Ivan Kragoff] publish_address {}, bound_addresses {[::1]:9300}, {}
[2016-02-26 15:35:32,855][INFO ][discovery ] [Ivan Kragoff] elasticsearch/88eryboWSUi7RUSeUjbrYg
[2016-02-26 15:35:36,886][INFO ][cluster.service ] [Ivan Kragoff] new_master {Ivan Kragoff}{88eryboWSUi7RUSeUjbrYg}{}{}, reason: zen-disco-join(elected_as_master, [0] joins received)
[2016-02-26 15:35:36,959][INFO ][gateway ] [Ivan Kragoff] recovered [0] indices into cluster_state
[2016-02-26 15:35:37,190][INFO ][http ] [Ivan Kragoff] publish_address {}, bound_addresses {[::1]:9200}, {}
[2016-02-26 15:35:37,190][INFO ][node ] [Ivan Kragoff] started
[2016-02-26 15:38:14,861][INFO ][cluster.metadata ] [Ivan Kragoff] [patdocument-2] creating index, cause [api], templates [], shards [1]/[0], mappings [patdocument]
[2016-02-26 15:38:15,158][INFO ][cluster.metadata ] [Ivan Kragoff] [patdocument-2] creating index, cause [api], templates [], shards [1]/[0], mappings [patdocument]
[2016-02-26 15:38:15,239][INFO ][cluster.routing.allocation] [Ivan Kragoff] Cluster health status changed from [RED] to [GREEN] (reason: [shards started [[patdocument-2][0]] ...]).
[2016-02-26 15:38:16,116][WARN ][threadpool ] [Ivan Kragoff] failed to run org.elasticsearch.index.shard.IndexShard$EngineRefresher@3caefd97
[patdocument-2][[patdocument-2][0]] EngineClosedException[CurrentState[CLOSED] Closed]
at org.elasticsearch.index.shard.IndexShard.engine(
at org.elasticsearch.index.shard.IndexShard$
at org.elasticsearch.threadpool.ThreadPool$
at java.util.concurrent.Executors$
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(
at java.util.concurrent.ScheduledThreadPoolExecutor$
at java.util.concurrent.ThreadPoolExecutor.runWorker(
at java.util.concurrent.ThreadPoolExecutor$
[2016-02-26 15:38:25,373][INFO ][cluster.metadata ] [Ivan Kragoff] [patdocument-50] creating index, cause [api], templates [], shards [1]/[0], mappings [patdocument]
[2016-02-26 15:38:25,504][INFO ][cluster.metadata ] [Ivan Kragoff] [patdocument-50] creating index, cause [api], templates [], shards [1]/[0], mappings [patdocument]
[2016-02-26 15:38:25,580][INFO ][cluster.routing.allocation] [Ivan Kragoff] Cluster health status changed from [RED] to [GREEN] (reason: [shards started [[patdocument-50][0]] ...]).
[2016-02-26 15:38:26,466][WARN ][threadpool ] [Ivan Kragoff] failed to run org.elasticsearch.index.shard.IndexShard$EngineRefresher@11714ec1

What's the output from _cat/indices? It could be that an index is closed, but it can't hurt to check.

Output showed all indices open. These seems to have happened soon after the index is created - perhaps there is some "settling period" now in 2.2.0 that wasn't in 1.5.2 for newly created indexes?

I'll check and see what the client is doing when this happens, and report back.

  • Tim


Well, it seems that's the problem anyway, at least in the use case I've got.

In my code, methods are not aware of other methods having been called (this is on purpose), so an index in my use case may be deleted and recreated several times in succession.

I find that if I put a 1 second delay after each index creation, I don't get the above exceptions in the ES server. If that delay is not there, and even though the Java API tells me that index creation is complete, I get that error later on (I think it happens when delete is once again called).

Here is some code that demonstrates this happening in the server (see the commented out call to Thread.sleep() - the code works when that is enabled). By the way, is there a better place I should be reporting this than this forum?

  • Tim

Why are you repeatedly deleting and recreating the same index?

As mentioned, the creations and deletions are happening in methods way down the call stack, and they are not aware of each other or what has been done prior. They attempt to start from a known state (delete an index if it exists, create it if it doesn't); in practice, it will produce - on a server without a given index - "create/delete/create" for each indexing stage.

It may be inefficient, but this happens only when we do a full reindexing from scratch, which isn't very often.

In any case, why does it matter? Doing so shouldn't be producing such an exception in the server, don't you think?

  • Tim

Deleting and creating indices both require the cluster state to be updated and propagated across the cluster in a consistent manner, and are therefore not instantaneous. If you are reindexing a single index, have you looked into using aliases to reindex without downtime?

We are reindexing many indexes when we do a full reindexing.

However, we are not clustering - this reindexing is against a single node with a single shard, I believe, at least in development (I haven't spent a lot of time configuring the servers for efficiency or load - they're pretty close to default setup right now, except for a few items). Shouldn't that happen pretty quick?

In production, we may, in the future, add some more spreading out of the nodes and shards.

Is there any way to delete and create an index, and know when it has been fully completed and propagated, so that I can proceed with further operations? Also, why can't the server simply pause processing an operation until the prior one is complete, so that this exception does not occur?

Meanwhile, I'll look at your link there.

  • Tim