ES cluster in Docker containers: AlreadyClosedException "Underlying file changed by an external force"

Hello everyone,
I am getting this exception when trying to update some documents in one of my indices.
I will post some of my Elasticsearch cluster configuration (ES version, index state) below.

[2016-04-28 08:51:57,455][WARN ][cluster.action.shard ] [Lightspeed] [caparmor_2015][3] received shard failed for target shard [[caparmor_2015][3], node[Y1kc958iT46nd-EtVbUKgQ], [P], v[7], s[STARTED], a[id=8C35eXqIR8WZ0sikSeRDqQ]], indexUUID [fVYBjB8-SSenL4KxCk_3TA], message [engine failure, reason [already closed by tragic event on the index writer]], failure [AlreadyClosedException[Underlying file changed by an external force at 2016-04-15T10:59:09.550936Z, (lock=NativeFSLock(path=/usr/share/elasticsearch/data/caparmor/nodes/0/indices/caparmor_2015/3/index/write.lock,impl=sun.nio.ch.FileLockImpl[0:9223372036854775807 exclusive valid],ctime=2016-04-15T10:59:09.550936Z))]]
org.apache.lucene.store.AlreadyClosedException: Underlying file changed by an external force at 2016-04-15T10:59:09.550936Z, (lock=NativeFSLock(path=/usr/share/elasticsearch/data/caparmor/nodes/0/indices/caparmor_2015/3/index/write.lock,impl=sun.nio.ch.FileLockImpl[0:9223372036854775807 exclusive valid],ctime=2016-04-15T10:59:09.550936Z))
at org.apache.lucene.store.NativeFSLockFactory$NativeFSLock.ensureValid(NativeFSLockFactory.java:179)
at org.apache.lucene.store.LockValidatingDirectoryWrapper.createOutput(LockValidatingDirectoryWrapper.java:43)
at org.apache.lucene.store.TrackingDirectoryWrapper.createOutput(TrackingDirectoryWrapper.java:43)
at org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.<init>(CompressingStoredFieldsWriter.java:108)
at org.apache.lucene.codecs.compressing.CompressingStoredFieldsFormat.fieldsWriter(CompressingStoredFieldsFormat.java:128)
at org.apache.lucene.codecs.lucene50.Lucene50StoredFieldsFormat.fieldsWriter(Lucene50StoredFieldsFormat.java:183)
at org.apache.lucene.index.DefaultIndexingChain.initStoredFieldsWriter(DefaultIndexingChain.java:81)
at org.apache.lucene.index.DefaultIndexingChain.startStoredFields(DefaultIndexingChain.java:279)
at org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:316)
at org.apache.lucene.index.DocumentsWriterPerThread.updateDocuments(DocumentsWriterPerThread.java:273)
at org.apache.lucene.index.DocumentsWriter.updateDocuments(DocumentsWriter.java:413)
at org.apache.lucene.index.IndexWriter.updateDocuments(IndexWriter.java:1318)
at org.elasticsearch.index.engine.InternalEngine.innerIndex(InternalEngine.java:539)
at org.elasticsearch.index.engine.InternalEngine.index(InternalEngine.java:457)
at org.elasticsearch.index.shard.IndexShard.index(IndexShard.java:601)
at org.elasticsearch.index.engine.Engine$Index.execute(Engine.java:836)
at org.elasticsearch.action.index.TransportIndexAction.executeIndexRequestOnPrimary(TransportIndexAction.java:237)
at org.elasticsearch.action.bulk.TransportShardBulkAction.shardIndexOperation(TransportShardBulkAction.java:326)
at org.elasticsearch.action.bulk.TransportShardBulkAction.shardUpdateOperation(TransportShardBulkAction.java:389)
at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:191)
at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:68)
at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryPhase.doRun(TransportReplicationAction.java:639)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:279)
at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:271)
at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:75)
at org.elasticsearch.transport.TransportService$4.doRun(TransportService.java:376)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
[2016-04-28 08:51:57,458][INFO ][cluster.routing.allocation] [Lightspeed] Cluster health status changed from [GREEN] to [YELLOW] (reason: [shards failed [[caparmor_2015][3]] ...]).
[2016-04-28 08:52:32,788][INFO ][cluster.routing.allocation] [Lightspeed] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[caparmor_2015][3]] ...]).

curl -XGET localhost:9200

{
  "name" : "Lightspeed",
  "cluster_name" : "caparmor",
  "version" : {
    "number" : "2.3.1",
    "build_hash" : "bd980929010aef404e7cb0843e61d0665269fc39",
    "build_timestamp" : "2016-04-04T12:25:05Z",
    "build_snapshot" : false,
    "lucene_version" : "5.5.0"
  },
  "tagline" : "You Know, for Search"
}

curl -XGET 'localhost:9200/_cat/indices/caparmor_2015?v'

health status index         pri rep docs.count docs.deleted store.size pri.store.size
green  open   caparmor_2015   5   1    7459462          183      6.6gb          3.3gb

docker --version

Docker version 1.10.3, build 20f81dd

Elasticsearch data is not stored inside the containers: I bind-mount a volume from each host so that the data survives if something goes wrong with the cluster.
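
For reference, each node is started with a bind mount roughly like this (the host path, container name and image tag here are illustrative placeholders, not my exact values; the container-side path is the one from the logs):

docker run -d --name es-node1 \
  -p 9200:9200 -p 9300:9300 \
  -v /data/elasticsearch:/usr/share/elasticsearch/data \
  elasticsearch:2.3.1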

So I don't really know what to do about this... Thanks in advance to anyone who might be able to help.

Antoine

The ctime on the write.lock file changed.

Perhaps it's a broken filesystem (e.g. NFS) or something like that. Other explanations include someone deleting the write.lock file manually while writers are still open against it.

But we know that the lock is no longer valid, so we fail hard rather than corrupt data.

Hi, thank you very much for your answer.

Here is what I get when I keep retrying the indexing request:

curl -XPUT 'vsyslog:9200/caparmor_2015/jobs/2?pretty' -d '
{
"jobname":"toto",
"user":"jdoe"
}'

{
  "error" : {
    "root_cause" : [ {
      "type" : "index_failed_engine_exception",
      "reason" : "Index failed for [jobs#2]",
      "shard" : "2",
      "index" : "caparmor_2015"
    } ],
    "type" : "index_failed_engine_exception",
    "reason" : "Index failed for [jobs#2]",
    "shard" : "2",
    "index" : "caparmor_2015",
    "caused_by" : {
      "type" : "already_closed_exception",
      "reason" : "Underlying file changed by an external force at 2016-04-15T10:59:09.01333Z, (lock=NativeFSLock(path=/usr/share/elasticsearch/data/caparmor/nodes/0/indices/caparmor_2015/2/index/write.lock,impl=sun.nio.ch.FileLockImpl[0:9223372036854775807 exclusive valid],ctime=2016-04-15T10:59:09.01333Z))"
    }
  },
  "status" : 500
}


Second attempt:

 "caused_by" : {
  "type" : "already_closed_exception",
  "reason" : "Underlying file changed by an external force at 2016-04-15T12:12:23.633939Z, (lock=NativeFSLock(path=/usr/share/elasticsearch/data/caparmor/nodes/0/indices/caparmor_2015/2/index/write.lock,impl=sun.nio.ch.FileLockImpl[0:9223372036854775807 exclusive valid],ctime=2016-04-15T12:12:23.633939Z))"
}

Third attempt:

curl -XPUT 'vsyslog:9200/caparmor_2015/jobs/2?pretty' -d '
{
"jobname":"toto",
"user":"jdoe"
}'

{
  "_index" : "caparmor_2015",
  "_type" : "jobs",
  "_id" : "2",
  "_version" : 1,
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "created" : true
}


Is there a better solution than simply retrying until the indexing succeeds?
I'm currently trying to find out which filesystem problems could cause this.

I think even just touching the write.lock file could be enough to trigger this. I realize this may seem harsh, but we have to detect that our lock may no longer be valid because it is, e.g., a different file. The inode number is imperfect, because inode numbers get recycled, so we use ctime for now. Just make sure nothing messes with write.lock; seriously, don't even touch it :slight_smile:
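
If you want to check whether something is changing it, you can watch the file's change time (ctime) between attempts, e.g. with GNU stat, using the shard path from your log:

stat -c 'ctime=%z file=%n' /usr/share/elasticsearch/data/caparmor/nodes/0/indices/caparmor_2015/3/index/write.lock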

It will be hard to debug your case because the exception message is bad: it doesn't include the actual vs. expected timestamps. I fixed that here: https://github.com/apache/lucene-solr/commit/3b9aa4d5cc42083aab8957e61327a90cbc267de3 but for now it's going to be harder to debug.

Thanks for your answer again.

After investigating a little, I observed that when this problem occurs, the write.lock file is not deleted.
But something else happens:

I made a little script that lists the write.lock file of every shard in the cluster, and watched what happens each time I index a document.
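
Roughly, it is just a find over the node's data directory (same path as in the logs above), something like:

#!/bin/sh
# List every shard's write.lock with owner and timestamps so changes are easy to spot.
DATA_DIR=/usr/share/elasticsearch/data/caparmor/nodes/0/indices
find "$DATA_DIR" -name write.lock -exec ls -l {} \;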

Here is an example to show how it happens:

1> I try to index a simple document. ES chooses shard 1, for example.
2> Shard 1 (primary) fails because of write.lock.
3> Shard 1 is relocated to another node: a new write.lock file is created with the current date (the relocation can be followed with the _cat/shards command shown after this list).
4> I retry the indexing. It seems ES now tries the replica of shard 1.
5> Shard 1 (replica) is relocated as well. A new write.lock file is created, as in 3>.
6> The last attempt works fine. No shard is relocated this time.
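
For reference, this is how I check between two attempts on which node each primary and replica of caparmor_2015 currently lives:

curl -XGET 'localhost:9200/_cat/shards/caparmor_2015?v'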

Is this the normal behaviour?

The problem is that the next time I index a document and ES chooses this shard, the same thing will happen, and I will have to retry twice before succeeding.

The question is why shard 1 fails in 2> in the first place. What are you doing to that write.lock? Are you using a shared filesystem by any chance?

Hi, thanks for your answer.
I don't do anything to the write.lock file. In fact, I only discovered its existence through these error messages.
But now I do keep an eye on it, because it is what makes my indexing attempts fail (sorry, my English may not be proper).

I am not using shared filesystems. The ES data is on XFS, and each filesystem is dedicated to one node.
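
(This can be double-checked from inside a container, since the data directory is bind-mounted from the host; the container name here is a placeholder. The Type column of df -T shows the filesystem type of the underlying host mount:

docker exec es-node1 df -T /usr/share/elasticsearch/data)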

I found the probable cause of these failures:
--> the data directories are bind-mounted into the Docker containers.
When I list the file from the host:

-rw-r--r-- 1 ntp ssh 0 May 9 10:05 write.lock

When I list the file from the Docker container:

-rw-r--r-- 1 elasticsearch elasticsearch 0 May 9 08:05 write.lock

This is because my Docker containers are using this timezone:

root@e89522442201:/usr/share/elasticsearch# cat /etc/timezone
Etc/UTC

whereas my host is using this timezone:

root@vsyslog6:~ # cat /etc/timezone
Europe/Paris

I guess I will have to change the timezone used by one or the other...
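
One way to try this (just a sketch; I have not yet confirmed that the timezone is really the cause) would be to give the containers the host's timezone by adding read-only mounts to the existing run command:

docker run ... \
  -v /etc/localtime:/etc/localtime:ro \
  -v /etc/timezone:/etc/timezone:ro \
  ...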

Did modifying the timezone actually resolve your issue?