Environment
- Elasticsearch version: 7.4.2 (docker)
Problem Summary
If the file /usr/share/elasticsearch/config/elasticsearch.keystore.tmp already exists when Elasticsearch starts, bootstrap fails and the node does not start.
Steps to reproduce (PRODUCTION)
Start the Elasticsearch docker container, and stop it while it is creating elasticsearch.keystore (this is very difficult to do manually, but it's exactly what happened in our case)
Steps to reproduce (SIMULATION)
Create a custom docker-compose file that maps a volume containing an existing, empty elasticsearch.keystore.tmp file to /usr/share/elasticsearch/config/elasticsearch.keystore.tmp, like the following:
version: '3.7'
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.4.2
    environment:
      - node.name=es01
      - cluster.name=es-sample-cluster
      - discovery.type=single-node
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms4g -Xmx4g"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - ./elasticsearch.keystore.tmp:/usr/share/elasticsearch/config/elasticsearch.keystore.tmp
    ports:
      - 9200:9200
Expected Result
Elasticsearch starts correctly
Actual Result
Elasticsearch does not start, and fails with the following error:
elasticsearch_1 | Exception in thread "main" org.elasticsearch.bootstrap.BootstrapException: java.nio.file.FileAlreadyExistsException: /usr/share/elasticsearch/config/elasticsearch.keystore.tmp
elasticsearch_1 | Likely root cause: java.nio.file.FileAlreadyExistsException: /usr/share/elasticsearch/config/elasticsearch.keystore.tmp
elasticsearch_1 | at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:94)
elasticsearch_1 | at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
elasticsearch_1 | at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116)
elasticsearch_1 | at java.base/sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:219)
elasticsearch_1 | at java.base/java.nio.file.spi.FileSystemProvider.newOutputStream(FileSystemProvider.java:478)
elasticsearch_1 | at java.base/java.nio.file.Files.newOutputStream(Files.java:223)
elasticsearch_1 | at org.apache.lucene.store.FSDirectory$FSIndexOutput.<init>(FSDirectory.java:410)
elasticsearch_1 | at org.apache.lucene.store.FSDirectory$FSIndexOutput.<init>(FSDirectory.java:406)
elasticsearch_1 | at org.apache.lucene.store.FSDirectory.createOutput(FSDirectory.java:254)
elasticsearch_1 | at org.elasticsearch.common.settings.KeyStoreWrapper.save(KeyStoreWrapper.java:484)
elasticsearch_1 | at org.elasticsearch.bootstrap.Bootstrap.loadSecureSettings(Bootstrap.java:242)
elasticsearch_1 | at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:305)
elasticsearch_1 | at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:159)
elasticsearch_1 | at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:150)
elasticsearch_1 | at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:86)
elasticsearch_1 | at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:125)
elasticsearch_1 | at org.elasticsearch.cli.Command.main(Command.java:90)
elasticsearch_1 | at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:115)
elasticsearch_1 | at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:92)
elasticsearch_1 | Refer to the log for complete error details.
das_elasticsearch_1 exited with code 1
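The trace shows the exception surfacing from Files.newOutputStream, called through Lucene's FSDirectory.createOutput. The following is a minimal sketch of that mechanism outside Elasticsearch, assuming the temporary file is opened with StandardOpenOption.CREATE_NEW (inferred from the exception type, not verified against the Lucene source here). The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class CreateNewDemo {
    // Hypothetical helper: returns true if opening `file` in create-new mode
    // fails because the file is already present.
    static boolean failsWhenPresent(Path file) throws IOException {
        try (OutputStream out = Files.newOutputStream(
                file, StandardOpenOption.WRITE, StandardOpenOption.CREATE_NEW)) {
            return false; // file did not exist and has now been created
        } catch (FileAlreadyExistsException e) {
            return true;  // same exception type as in the trace above
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("keystore-demo");
        Path tmp = dir.resolve("elasticsearch.keystore.tmp");
        System.out.println(failsWhenPresent(tmp)); // false: first open creates the file
        System.out.println(failsWhenPresent(tmp)); // true: leftover file blocks the second open
    }
}
```

If that assumption holds, any leftover elasticsearch.keystore.tmp from an interrupted start would be enough to trigger the failure on every later start.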
Considerations
I narrowed down the problem to the following method: https://github.com/elastic/elasticsearch/blob/v7.4.2/server/src/main/java/org/elasticsearch/common/settings/KeyStoreWrapper.java#L478
The KeyStoreWrapper.save() method does not explicitly handle java.nio.file.FileAlreadyExistsException, so it simply fails and exits. I would expect this exception to be handled explicitly and logged, and the service to start all the same.
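To illustrate the kind of handling I have in mind, here is a rough sketch. It is not Elasticsearch code and not a proposed patch: TolerantSave and its save method are hypothetical names, and it assumes that a leftover temporary file can safely be discarded because it was never moved into place.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

public class TolerantSave {
    // Hypothetical helper: writes `data` to <configDir>/<name> through a
    // temporary file, discarding any leftover from an interrupted run first.
    static void save(Path configDir, String name, byte[] data) throws IOException {
        Path tmp = configDir.resolve(name + ".tmp");
        Path target = configDir.resolve(name);

        // Assumption: a stale .tmp file holds no data worth keeping.
        if (Files.deleteIfExists(tmp)) {
            System.err.println("Removed stale temporary file " + tmp);
        }

        // Still create-new, so a concurrent writer would be detected.
        try (OutputStream out = Files.newOutputStream(
                tmp, StandardOpenOption.WRITE, StandardOpenOption.CREATE_NEW)) {
            out.write(data);
        }

        // Replace the target in one step so readers never see a partial file.
        Files.move(tmp, target,
                StandardCopyOption.REPLACE_EXISTING, StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("keystore-demo");
        Files.createFile(dir.resolve("elasticsearch.keystore.tmp")); // simulate the leftover
        save(dir, "elasticsearch.keystore", new byte[] {1, 2, 3});
        System.out.println(Files.exists(dir.resolve("elasticsearch.keystore")));
    }
}
```

Whether silently deleting the file is acceptable for a keystore is a design question for the maintainers. Failing with a clearer message that names the stale file would also be an improvement over the current behaviour.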