Hello,
since some time we can find the following warning in certain time intervals in logs of elastic in our environments. Usually this has no effect, but sometimes not all shards are assigned and you have to do a retry here.
[2022-11-24T00:04:38,264][WARN ][o.e.i.e.Engine ] [server-md] [logs-2022-11-23][0]failed engine [lucene commit failed]
org.apache.lucene.store.AlreadyClosedException: Underlying file changed by an external force at 2022-11-22T23:00:13Z, (lock=NativeFSLock(path=/path/elastic/var/data/storage/md/nodes/0/indices/Zz9LjRRcRhqSBFv7Cc2k2Q/0/index/write.lock,impl=sun.nio.ch.FileLockImpl[0:9223372036854775807 exclusive valid],creationTime=2022-11-22T23:00:13.384888Z))
At first we thought that another process is touching the indexes of Elastic. We checked with auditctl if there are other users accessing the indexes. However, this is not the case.
Afterwards we noticed that the timestamp (atime and mtime) is always exactly 0 milliseconds.
stat /path/elastic/var/data/storage/md/nodes/0/indices/…/0/index/write.lock
File: '/path/elastic/var/data/storage/md/nodes/0/indices/…/0/index/write.lock'
Size: 0 Blocks: 0 IO Block: 4096 regular empty File
Device: fe05h/65029d Inode: 71303504 Verknüpfungen: 1
Access: (0644/-rw-r--r--) Uid: ( 1216/ elastic) Gid: ( 1216/ elastic)
Access: 2022-11-30 00:00:09.000000000 +0100
Modify: 2022-11-30 00:00:09.000000000 +0100
Change: 2022-11-30 22:01:07.559373558 +0100
Birth : -
In a single case this would also be strange, but this occurs with 3202 files.
ls -aRl --time-style=full-iso | grep -c 00000
3202
However, if you run a touch command on the system they will be set correctly.
stat touch_file
File: 'touch_file'
Größe: 0 Blöcke: 0 EA Block: 4096 regular empty File
Device: fe04h/65028d Inode: 1460 Verknüpfungen: 1
Access: (0644/-rw-r--r--) Uid: ( 8057/ user) Gid: ( 100/ users)
Access: 2022-12-01 08:07:31.168801039 +0100
Modify: 2022-12-01 08:07:31.168801039 +0100
Change: 2022-12-01 08:07:31.168801039 +0100
Birth : -
So the question is what is the cause of the missing nanoseconds. However, this is most likely the cause of the error, since the comparison cannot be performed correctly. The comparison must fail, because the creationTime is in the future, because the milliseconds are present here, but not with the file.
creationTime=2022-11-29T23:00:09.595968Z > Underlying file changed by an external force at 2022-11-29T23:00:09Z
We are using in SLES12-SP5/20220823 and as file system XFS and the following Elastic version:
"version": {
"number": "7.10.2",
.....
"lucene_version": "8.7.0",
"minimum_wire_compatibility_version": "6.8.0",
"minimum_index_compatibility_version": "6.0.0-beta1"
}
The directory is mounted with the following parameters
/dev/mapper/vg00-path on /path type xfs (rw,noatime,attr2,inode64,noquota)
Do you have any ideas what could be responsible for the error?
Thanks in advance!