Elasticsearch keeps restarting, despite the pods reaching a ready state. I installed it using the official Elasticsearch chart with default settings. I raised this question on the chart's GitHub project and was redirected here.
Chart version:
$ helm search repo elasticsearch
elastic/elasticsearch 7.17.3
Kubernetes version:
$ k version
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.4", GitCommit:"b695d79d4f967c403a96986f1750a35eb75e75f1", GitTreeState:"clean", BuildDate:"2021-11-17T15:48:33Z", GoVersion:"go1.16.10", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.4", GitCommit:"ba58f86b00f6b0f0b7694a75464aa7806f8bf6fc", GitTreeState:"clean", BuildDate:"2022-03-30T23:40:46Z", GoVersion:"go1.16.10", Compiler:"gc", Platform:"linux/amd64"}
Kubernetes provider:
AKS
Helm Version:
$ helm version
version.BuildInfo{Version:"v3.7.1", GitCommit:"1d11fcb5d3f3bf00dbe6fe31b8412839a96b3dc4", GitTreeState:"clean", GoVersion:"go1.16.9"}
Describe the bug:
If I install using the details provided above (without any values file), Elasticsearch fails after a short period of being ready.
Steps to reproduce:
$ helm install elasticsearch elastic/elasticsearch
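For completeness, the elastic Helm repo was added beforehand with the standard commands from the chart's README (assuming the default repo URL):
$ helm repo add elastic https://helm.elastic.co
$ helm repo update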
Provide logs and/or server output (if relevant):
$ k get pods -l app=elasticsearch-master -w
NAME READY STATUS RESTARTS AGE
elasticsearch-master-0 0/1 Init:0/1 0 27s
elasticsearch-master-1 0/1 Running 0 27s
elasticsearch-master-2 0/1 Init:0/1 0 27s
elasticsearch-master-2 0/1 Init:0/1 0 48s
elasticsearch-master-2 0/1 PodInitializing 0 51s
elasticsearch-master-2 0/1 Running 0 52s
elasticsearch-master-0 0/1 PodInitializing 0 66s
elasticsearch-master-0 0/1 Running 0 67s
elasticsearch-master-0 1/1 Running 0 2m
elasticsearch-master-1 1/1 Running 0 2m
elasticsearch-master-2 1/1 Running 0 2m3s
elasticsearch-master-1 1/1 Terminating 0 2m24s
elasticsearch-master-1 0/1 Terminating 0 2m25s
elasticsearch-master-1 0/1 Terminating 0 2m25s
elasticsearch-master-1 0/1 Terminating 0 2m25s
elasticsearch-master-1 0/1 Pending 0 0s
elasticsearch-master-1 0/1 Pending 0 0s
elasticsearch-master-1 0/1 Init:0/1 0 0s
elasticsearch-master-1 0/1 PodInitializing 0 5s
elasticsearch-master-1 0/1 Running 0 6s
elasticsearch-master-1 1/1 Running 0 60s
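If more detail is needed, I can also capture the pod description and cluster events around the time of the restart; roughly along these lines (just a sketch, pod name taken from the watch above):
$ kubectl describe pod elasticsearch-master-1
$ kubectl get events --sort-by=.lastTimestamp | grep elasticsearch-master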
The logs are rather long, so here's just the ending bit:
{"type": "server", "timestamp": "2022-05-18T08:34:01,977Z", "level": "INFO", "component": "o.e.c.r.DelayedAllocationService", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-2", "message": "scheduling reroute for delayed shards in [59.9s] (1 delayed shards)", "cluster.uuid": "5rm76qDXRfmfn4ZzfOuHfQ", "node.id": "jJY1CYS7ReiQdY0fSgRUiQ" }
{"type": "server", "timestamp": "2022-05-18T08:34:02,154Z", "level": "INFO", "component": "o.e.i.g.GeoIpDownloader", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-2", "message": "updating geoip databases", "cluster.uuid": "5rm76qDXRfmfn4ZzfOuHfQ", "node.id": "jJY1CYS7ReiQdY0fSgRUiQ" }
{"type": "server", "timestamp": "2022-05-18T08:34:02,155Z", "level": "INFO", "component": "o.e.i.g.GeoIpDownloader", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-2", "message": "fetching geoip databases overview from [https://geoip.elastic.co/v1/database?elastic_geoip_service_tos=agree]", "cluster.uuid": "5rm76qDXRfmfn4ZzfOuHfQ", "node.id": "jJY1CYS7ReiQdY0fSgRUiQ" }
{"type": "server", "timestamp": "2022-05-18T08:34:02,812Z", "level": "INFO", "component": "o.e.i.g.GeoIpDownloader", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-2", "message": "geoip database [GeoLite2-ASN.mmdb] is up to date, updated timestamp", "cluster.uuid": "5rm76qDXRfmfn4ZzfOuHfQ", "node.id": "jJY1CYS7ReiQdY0fSgRUiQ" }
{"type": "server", "timestamp": "2022-05-18T08:34:02,882Z", "level": "INFO", "component": "o.e.i.g.GeoIpDownloader", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-2", "message": "geoip database [GeoLite2-City.mmdb] is up to date, updated timestamp", "cluster.uuid": "5rm76qDXRfmfn4ZzfOuHfQ", "node.id": "jJY1CYS7ReiQdY0fSgRUiQ" }
{"type": "server", "timestamp": "2022-05-18T08:34:02,954Z", "level": "INFO", "component": "o.e.i.g.GeoIpDownloader", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-2", "message": "geoip database [GeoLite2-Country.mmdb] is up to date, updated timestamp", "cluster.uuid": "5rm76qDXRfmfn4ZzfOuHfQ", "node.id": "jJY1CYS7ReiQdY0fSgRUiQ" }
{"type": "server", "timestamp": "2022-05-18T08:34:41,440Z", "level": "INFO", "component": "o.e.c.s.MasterService", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-2", "message": "node-join[{elasticsearch-master-0}{6yeN9JukTkKW2q5-NoBAYQ}{GDkulEhxQCW9Ek_qgGPijQ}{10.244.13.205}{10.244.13.205:9300}{cdfhilmrstw} join existing leader], term: 5, version: 80, delta: added {{elasticsearch-master-0}{6yeN9JukTkKW2q5-NoBAYQ}{GDkulEhxQCW9Ek_qgGPijQ}{10.244.13.205}{10.244.13.205:9300}{cdfhilmrstw}}", "cluster.uuid": "5rm76qDXRfmfn4ZzfOuHfQ", "node.id": "jJY1CYS7ReiQdY0fSgRUiQ" }
{"type": "server", "timestamp": "2022-05-18T08:34:42,462Z", "level": "INFO", "component": "o.e.c.s.ClusterApplierService", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-2", "message": "added {{elasticsearch-master-0}{6yeN9JukTkKW2q5-NoBAYQ}{GDkulEhxQCW9Ek_qgGPijQ}{10.244.13.205}{10.244.13.205:9300}{cdfhilmrstw}}, term: 5, version: 80, reason: Publication{term=5, version=80}", "cluster.uuid": "5rm76qDXRfmfn4ZzfOuHfQ", "node.id": "jJY1CYS7ReiQdY0fSgRUiQ" }
{"type": "server", "timestamp": "2022-05-18T08:34:43,552Z", "level": "INFO", "component": "o.e.c.r.a.AllocationService", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-2", "message": "Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[.geoip_databases][0]]]).", "cluster.uuid": "5rm76qDXRfmfn4ZzfOuHfQ", "node.id": "jJY1CYS7ReiQdY0fSgRUiQ" }
{"type": "server", "timestamp": "2022-05-18T08:36:01,684Z", "level": "INFO", "component": "o.e.t.ClusterConnectionManager", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-2", "message": "transport connection to [{elasticsearch-master-0}{6yeN9JukTkKW2q5-NoBAYQ}{GDkulEhxQCW9Ek_qgGPijQ}{10.244.13.205}{10.244.13.205:9300}{cdfhilmrstw}] closed by remote", "cluster.uuid": "5rm76qDXRfmfn4ZzfOuHfQ", "node.id": "jJY1CYS7ReiQdY0fSgRUiQ" }
{"type": "server", "timestamp": "2022-05-18T08:36:01,686Z", "level": "INFO", "component": "o.e.c.r.a.AllocationService", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-2", "message": "Cluster health status changed from [GREEN] to [YELLOW] (reason: [{elasticsearch-master-0}{6yeN9JukTkKW2q5-NoBAYQ}{GDkulEhxQCW9Ek_qgGPijQ}{10.244.13.205}{10.244.13.205:9300}{cdfhilmrstw} reason: disconnected]).", "cluster.uuid": "5rm76qDXRfmfn4ZzfOuHfQ", "node.id": "jJY1CYS7ReiQdY0fSgRUiQ" }
{"type": "server", "timestamp": "2022-05-18T08:36:01,687Z", "level": "INFO", "component": "o.e.c.s.MasterService", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-2", "message": "node-left[{elasticsearch-master-0}{6yeN9JukTkKW2q5-NoBAYQ}{GDkulEhxQCW9Ek_qgGPijQ}{10.244.13.205}{10.244.13.205:9300}{cdfhilmrstw} reason: disconnected], term: 5, version: 83, delta: removed {{elasticsearch-master-0}{6yeN9JukTkKW2q5-NoBAYQ}{GDkulEhxQCW9Ek_qgGPijQ}{10.244.13.205}{10.244.13.205:9300}{cdfhilmrstw}}", "cluster.uuid": "5rm76qDXRfmfn4ZzfOuHfQ", "node.id": "jJY1CYS7ReiQdY0fSgRUiQ" }
{"type": "server", "timestamp": "2022-05-18T08:36:01,703Z", "level": "INFO", "component": "o.e.c.s.ClusterApplierService", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-2", "message": "removed {{elasticsearch-master-0}{6yeN9JukTkKW2q5-NoBAYQ}{GDkulEhxQCW9Ek_qgGPijQ}{10.244.13.205}{10.244.13.205:9300}{cdfhilmrstw}}, term: 5, version: 83, reason: Publication{term=5, version=83}", "cluster.uuid": "5rm76qDXRfmfn4ZzfOuHfQ", "node.id": "jJY1CYS7ReiQdY0fSgRUiQ" }
{"type": "server", "timestamp": "2022-05-18T08:36:01,713Z", "level": "INFO", "component": "o.e.c.r.DelayedAllocationService", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-2", "message": "scheduling reroute for delayed shards in [59.9s] (1 delayed shards)", "cluster.uuid": "5rm76qDXRfmfn4ZzfOuHfQ", "node.id": "jJY1CYS7ReiQdY0fSgRUiQ" }
Any additional context:
There is a stacktrace or two in the logs. I think they're harmless, but here are the first few lines of one:
{"type": "server", "timestamp": "2022-05-18T08:36:41,818Z", "level": "INFO", "component": "o.e.i.g.DatabaseNodeService", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-0", "message": "retrieve geoip database [GeoLite2-ASN.mmdb] from [.geoip_databases] to [/tmp/elasticsearch-10444144470664545833/geoip-databases/6yeN9JukTkKW2q5-NoBAYQ/GeoLite2-ASN.mmdb.tmp.gz]" }
{"type": "server", "timestamp": "2022-05-18T08:36:41,815Z", "level": "ERROR", "component": "o.e.i.g.DatabaseNodeService", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-0", "message": "failed to retrieve database [GeoLite2-City.mmdb]",
"stacktrace": ["org.elasticsearch.cluster.block.ClusterBlockException: blocked by: [SERVICE_UNAVAILABLE/1/state not recovered / initialized];",
"at org.elasticsearch.cluster.block.ClusterBlocks.globalBlockedException(ClusterBlocks.java:179) ~[elasticsearch-7.17.3.jar:7.17.3]",
"at org.elasticsearch.cluster.block.ClusterBlocks.globalBlockedRaiseException(ClusterBlocks.java:165) ~[elasticsearch-7.17.3.jar:7.17.3]",
"at org.elasticsearch.action.search.TransportSearchAction.executeSearch(TransportSearchAction.java:929) ~[elasticsearch-7.17.3.jar:7.17.3]",
"at org.elasticsearch.action.search.TransportSearchAction.executeLocalSearch(TransportSearchAction.java:763) ~[elasticsearch-7.17.3.jar:7.17.3]",
"at org.elasticsearch.action.search.TransportSearchAction.lambda$executeRequest$6(TransportSearchAction.java:399) ~[elasticsearch-7.17.3.jar:7.17.3]",
"at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:136) ~[elasticsearch-7.17.3.jar:7.17.3]",
The others complain about different GeoLite2 databases.
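If it helps, I can also report cluster health from inside one of the pods while they are up, roughly like this (a sketch, assuming the chart's default of plain HTTP on port 9200 and that curl is available in the image):
$ kubectl exec elasticsearch-master-0 -- curl -s http://localhost:9200/_cluster/health?pretty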