ERROR: Elasticsearch died while starting up, with exit code 78

juancamiloll · January 16, 2025, 4:47pm

Today I found some errors in Kibana: dashboards were not loading.

I have a cluster of 3 Elasticsearch servers, plus one server each for Kibana, Logstash, and Fleet Server.

When I checked each of the Elasticsearch servers, their status was as follows:
Server 1: elasticsearch service in failed status
Server 2: elasticsearch service in running status
Server 3: elasticsearch service in failed status

Disk usage on each of them is as follows:
Server 1: 47%
Server 2: 46%
Server 3: 44%

On server 3, I found the following message in the log:

[2025-01-16T07:51:37,281][WARN ][o.e.c.r.a.DiskThresholdMonitor] [elastic-3] high disk watermark [90%] exceeded on [ite1RopSQxaEXeFsv3thrw][elastic-2][/home/data/elasticsearch] free: 46.9gb[7.9%], shards will be relocated away from this node; currently relocating away shards totalling [0] bytes; the node is expected to continue to overflow.
The node is expected to continue to exceed the high disk watermark when these relocations are completed.

Does it make sense to temporarily add these settings to the elasticsearch.yml file on each server, so that I can manage the cluster from Kibana again and investigate the largest indices or data streams?

cluster.routing.allocation.disk.watermark.low: 90%
cluster.routing.allocation.disk.watermark.high: 95%
cluster.routing.allocation.disk.watermark.flood_stage: 98%

The other, more desperate measure is to list the largest indices on disk and delete them to free up space, but I am worried that this could cause a bigger problem.
du -sh /home/data/elasticsearch/indices/* |sort -rh

Thank you in advance for any suggestions you may have.

leandrojmp · January 16, 2025, 5:16pm

Do not do this. You should never change the Elasticsearch data files manually; this will lead to data loss and may put your cluster in a state that is not possible to recover from.
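
For reference, index sizes and per-node disk usage can be read through the REST API instead of the data directory. The following is a minimal sketch, not something posted in this thread: the URL `https://localhost:9200`, the `elastic` user, and the function names are all assumptions. It only lists information and deletes nothing, and it only works once enough nodes are up for the cluster to elect a master.

```shell
# Hypothetical helpers; ES_URL and ES_USER are assumptions, adjust them to your cluster.
ES_URL="${ES_URL:-https://localhost:9200}"
ES_USER="${ES_USER:-elastic}"

# List indices sorted by on-disk size, largest first.
# -k skips TLS certificate verification; prefer --cacert with your CA file.
list_largest_indices() {
  curl -s -k -u "$ES_USER" "$ES_URL/_cat/indices?v&s=store.size:desc&h=index,pri,rep,store.size"
}

# Show disk usage and shard count per data node.
show_disk_allocation() {
  curl -s -k -u "$ES_USER" "$ES_URL/_cat/allocation?v"
}
```

If an index really has to go, the delete index API (`DELETE /<index-name>`) removes it in a way the cluster knows about, unlike removing directories under the data path.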

I don't think that hitting the flood stage watermark should affect the ability of Elasticsearch to start up, so your issue may be something else.

Since this is a systemd service you need to look in the system logs.

Try to start the elasticsearch service on one of the nodes that failed and check /var/log/syslog or /var/log/messages for any hint.
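
One way to narrow the system log down to the relevant entries is sketched below. It assumes the messages come from the `systemd-entrypoint` process, as they do on package installs of version 8; the function name is hypothetical.

```shell
# Print Elasticsearch startup and out-of-memory errors from a syslog-style file.
# Defaults to /var/log/syslog; pass /var/log/messages on RHEL-like systems.
es_startup_errors() {
  logfile="${1:-/var/log/syslog}"
  grep -E 'systemd-entrypoint\[[0-9]+\]: .*(ERROR|OutOfMemoryError)' "$logfile"
}

# Usage:
#   es_startup_errors /var/log/syslog
# The same messages are usually also available from the journal:
#   journalctl -u elasticsearch.service --no-pager
```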

juancamiloll · January 16, 2025, 6:17pm

@leandrojmp thank you very much for your recommendations

Elastic Server 1

root@elastic-1:/var/log# more syslog |grep -i error
Jan 15 16:03:06 elastic-1 systemd-entrypoint[364309]: java.lang.OutOfMemoryError: Java heap space
Jan 15 16:04:34 elastic-1 systemd-entrypoint[364309]: Terminating due to java.lang.OutOfMemoryError: Java heap space
Jan 15 16:04:37 elastic-1 systemd-entrypoint[364248]: ERROR: Elasticsearch exited unexpectedly, with exit code 3
Jan 16 09:07:51 elastic-1 systemd-entrypoint[3345067]: ERROR: Elasticsearch did not exit normally - check the logs at /var/log/elasticsearch/elacluster.log
Jan 16 09:08:22 elastic-1 systemd-entrypoint[3345067]: ERROR: Elasticsearch died while starting up, with exit code 78
Jan 16 09:50:19 elastic-1 systemd-entrypoint[3345353]: ERROR: Elasticsearch did not exit normally - check the logs at /var/log/elasticsearch/elacluster.log
Jan 16 09:50:51 elastic-1 systemd-entrypoint[3345353]: ERROR: Elasticsearch died while starting up, with exit code 78
Jan 16 10:11:48 elastic-1 systemd-entrypoint[3345540]: ERROR: Elasticsearch did not exit normally - check the logs at /var/log/elasticsearch/elacluster.log
Jan 16 10:12:19 elastic-1 systemd-entrypoint[3345540]: ERROR: Elasticsearch died while starting up, with exit code 78
Jan 16 10:20:50 elastic-1 systemd-entrypoint[3345693]: ERROR: Elasticsearch did not exit normally - check the logs at /var/log/elasticsearch/elacluster.log
Jan 16 10:21:21 elastic-1 systemd-entrypoint[3345693]: ERROR: Elasticsearch died while starting up, with exit code 78
Jan 16 10:52:00 elastic-1 systemd-entrypoint[3345843]: ERROR: Elasticsearch did not exit normally - check the logs at /var/log/elasticsearch/elacluster.log
Jan 16 10:52:31 elastic-1 systemd-entrypoint[3345843]: ERROR: Elasticsearch died while starting up, with exit code 78
Jan 16 12:49:45 elastic-1 systemd-entrypoint[3346226]: ERROR: Elasticsearch did not exit normally - check the logs at /var/log/elasticsearch/elacluster.log
Jan 16 12:50:16 elastic-1 systemd-entrypoint[3346226]: ERROR: Elasticsearch died while starting up, with exit code 78

Elastic Server 3

root@elastic-3:/home/ubunuser# more /var/log/syslog |grep -i error
Jan 16 08:24:31 elastic-3 systemd-entrypoint[580628]: java.lang.OutOfMemoryError: Java heap space
Jan 16 08:26:28 elastic-3 systemd-entrypoint[580628]: [7844382.807s][error][heapdump] Failed to merge segmented heap file (file: /var/lib/elasticsearch/java_pid580628.hprof)
Jan 16 08:26:30 elastic-3 systemd-entrypoint[580628]: Terminating due to java.lang.OutOfMemoryError: Java heap space
Jan 16 08:26:36 elastic-3 systemd-entrypoint[580567]: ERROR: Elasticsearch exited unexpectedly, with exit code 3
Jan 16 10:31:53 elastic-3 systemd-entrypoint[3460600]: ERROR: Elasticsearch did not exit normally - check the logs at /var/log/elasticsearch/elacluster.log
Jan 16 10:32:24 elastic-3 systemd-entrypoint[3460600]: ERROR: Elasticsearch died while starting up, with exit code 78
Jan 16 10:35:27 elastic-3 systemd-entrypoint[3460750]: ERROR: Elasticsearch did not exit normally - check the logs at /var/log/elasticsearch/elacluster.log
Jan 16 10:35:58 elastic-3 systemd-entrypoint[3460750]: ERROR: Elasticsearch died while starting up, with exit code 78
Jan 16 10:37:48 elastic-3 systemd-entrypoint[3460886]: ERROR: Elasticsearch did not exit normally - check the logs at /var/log/elasticsearch/elacluster.log
Jan 16 10:38:19 elastic-3 systemd-entrypoint[3460886]: ERROR: Elasticsearch died while starting up, with exit code 78
Jan 16 10:40:38 elastic-3 systemd-entrypoint[3461019]: ERROR: Elasticsearch did not exit normally - check the logs at /var/log/elasticsearch/elacluster.log
Jan 16 10:41:09 elastic-3 systemd-entrypoint[3461019]: ERROR: Elasticsearch died while starting up, with exit code 78

leandrojmp · January 16, 2025, 6:25pm

What are the specs of your nodes? How much memory do these machines have, and how much is configured for the heap in jvm.options?

They are complaining about memory:

Jan 15 16:04:34 elastic-1 systemd-entrypoint[364309]: Terminating due to java.lang.OutOfMemoryError: Java heap space

Also, what do you have in the logs in the same time frame?

ERROR: Elasticsearch did not exit normally - check the logs at /var/log/elasticsearch/elacluster.log

juancamiloll · January 16, 2025, 6:35pm

@leandrojmp

No jvm.options file had been created. I assigned 15 GB of RAM and the error persisted, then I tried with 8 GB of RAM and it still persisted.

LOG elacluster

root@elastic-1:/etc/elasticsearch/jvm.options.d# tail -50 /var/log/elasticsearch/elacluster.log
[2025-01-16T13:29:48,358][INFO ][o.e.p.PluginsService     ] [elastic-1] loaded module [x-pack-deprecation]
[2025-01-16T13:29:48,358][INFO ][o.e.p.PluginsService     ] [elastic-1] loaded module [x-pack-fleet]
[2025-01-16T13:29:48,358][INFO ][o.e.p.PluginsService     ] [elastic-1] loaded module [x-pack-aggregate-metric]
[2025-01-16T13:29:48,358][INFO ][o.e.p.PluginsService     ] [elastic-1] loaded module [x-pack-downsample]
[2025-01-16T13:29:48,358][INFO ][o.e.p.PluginsService     ] [elastic-1] loaded module [x-pack-profiling]
[2025-01-16T13:29:48,358][INFO ][o.e.p.PluginsService     ] [elastic-1] loaded module [ingest-geoip]
[2025-01-16T13:29:48,358][INFO ][o.e.p.PluginsService     ] [elastic-1] loaded module [x-pack-write-load-forecaster]
[2025-01-16T13:29:48,358][INFO ][o.e.p.PluginsService     ] [elastic-1] loaded module [search-business-rules]
[2025-01-16T13:29:48,358][INFO ][o.e.p.PluginsService     ] [elastic-1] loaded module [wildcard]
[2025-01-16T13:29:48,359][INFO ][o.e.p.PluginsService     ] [elastic-1] loaded module [ingest-attachment]
[2025-01-16T13:29:48,359][INFO ][o.e.p.PluginsService     ] [elastic-1] loaded module [x-pack-apm-data]
[2025-01-16T13:29:48,359][INFO ][o.e.p.PluginsService     ] [elastic-1] loaded module [x-pack-sql]
[2025-01-16T13:29:48,359][INFO ][o.e.p.PluginsService     ] [elastic-1] loaded module [unsigned-long]
[2025-01-16T13:29:48,359][INFO ][o.e.p.PluginsService     ] [elastic-1] loaded module [x-pack-async]
[2025-01-16T13:29:48,359][INFO ][o.e.p.PluginsService     ] [elastic-1] loaded module [runtime-fields-common]
[2025-01-16T13:29:48,359][INFO ][o.e.p.PluginsService     ] [elastic-1] loaded module [vector-tile]
[2025-01-16T13:29:48,359][INFO ][o.e.p.PluginsService     ] [elastic-1] loaded module [lang-expression]
[2025-01-16T13:29:48,360][INFO ][o.e.p.PluginsService     ] [elastic-1] loaded module [x-pack-eql]
[2025-01-16T13:29:49,604][INFO ][o.e.e.NodeEnvironment    ] [elastic-1] using [1] data paths, mounts [[/home (/dev/sdb)]], net usable_space [171.1gb], net total_space [589.5gb], types [ext4]
[2025-01-16T13:29:49,604][INFO ][o.e.e.NodeEnvironment    ] [elastic-1] heap size [8gb], compressed ordinary object pointers [true]
[2025-01-16T13:29:49,778][INFO ][o.e.n.Node               ] [elastic-1] node name [elastic-1], node ID [St_nPtbcQnWvb4fZJvY7HA], cluster name [elacluster], roles [data_cold, data, remote_cluster_client, master, data_warm, data_content, transform, data_hot, ml, data_frozen, ingest]
[2025-01-16T13:29:54,033][INFO ][o.e.i.r.RecoverySettings ] [elastic-1] using rate limit [40mb] with [default=40mb, read=0b, write=0b, max=0b]
[2025-01-16T13:29:54,211][INFO ][o.e.f.FeatureService     ] [elastic-1] Registered local node features [data_stream.auto_sharding, data_stream.lifecycle.global_retention, data_stream.rollover.lazy, desired_node.version_deprecated, esql.agg_values, esql.async_query, esql.base64_decode_encode, esql.casting_operator, esql.counter_types, esql.disable_nullable_opts, esql.from_options, esql.metadata_fields, esql.metrics_counter_fields, esql.mv_ordering_sorted_ascending, esql.mv_sort, esql.spatial_points_from_source, esql.spatial_shapes, esql.st_centroid_agg, esql.st_contains_within, esql.st_disjoint, esql.st_intersects, esql.st_x_y, esql.string_literal_auto_casting, esql.string_literal_auto_casting_extended, esql.timespan_abbreviations, features_supported, file_settings, geoip.downloader.database.configuration, health.dsl.info, health.extended_repository_indicator, knn_retriever_supported, license-trial-independent-version, mapper.index_sorting_on_nested, mapper.keyword_dimension_ignore_above, mapper.pass_through_priority, mapper.range.null_values_off_by_one_fix, mapper.source.synthetic_source_fallback, mapper.source.synthetic_source_stored_fields_advance_fix, mapper.track_ignored_source, mapper.vectors.bit_vectors, mapper.vectors.int4_quantization, rest.capabilities_action, retrievers_supported, rrf_retriever_supported, script.hamming, search.vectors.k_param_supported, security.migration_framework, security.roles_metadata_flattened, standard_retriever_supported, stats.include_disk_thresholds, text_similarity_reranker_retriever_supported, unified_highlighter_matched_fields, usage.data_tiers.precalculate_stats]
[2025-01-16T13:29:54,694][INFO ][o.e.x.m.p.l.CppLogMessageHandler] [elastic-1] [controller/3346710] [Main.cc@123] controller (64 bit): Version 8.15.3 (Build 44a990dc4c07de) Copyright (c) 2024 Elasticsearch BV
[2025-01-16T13:29:54,936][INFO ][o.e.t.a.APM              ] [elastic-1] Sending apm metrics is disabled
[2025-01-16T13:29:54,936][INFO ][o.e.t.a.APM              ] [elastic-1] Sending apm tracing is disabled
[2025-01-16T13:29:54,967][INFO ][o.e.x.s.Security         ] [elastic-1] Security is enabled
[2025-01-16T13:29:55,614][INFO ][o.e.x.s.a.s.FileRolesStore] [elastic-1] parsed [0] roles from file [/etc/elasticsearch/roles.yml]
[2025-01-16T13:29:55,815][INFO ][o.e.x.s.InitialNodeSecurityAutoConfiguration] [elastic-1] Auto-configuration will not generate a password for the elastic built-in superuser, as we cannot  determine if there is a terminal attached to the elasticsearch process. You can use the `bin/elasticsearch-reset-password` tool to set the password for the elastic user.
[2025-01-16T13:29:56,074][INFO ][o.e.x.w.Watcher          ] [elastic-1] Watcher initialized components at 2025-01-16T18:29:56.073Z
[2025-01-16T13:29:56,151][INFO ][o.e.x.p.ProfilingPlugin  ] [elastic-1] Profiling is enabled
[2025-01-16T13:29:56,171][INFO ][o.e.x.p.ProfilingPlugin  ] [elastic-1] profiling index templates will not be installed or reinstalled
[2025-01-16T13:29:56,175][INFO ][o.e.x.a.APMPlugin        ] [elastic-1] APM ingest plugin is enabled
[2025-01-16T13:29:56,217][INFO ][o.e.x.a.APMIndexTemplateRegistry] [elastic-1] APM index template registry is enabled
[2025-01-16T13:29:56,762][INFO ][o.e.t.n.NettyAllocator   ] [elastic-1] creating NettyAllocator with the following configs: [name=elasticsearch_configured, chunk_size=1mb, suggested_max_allocation_size=1mb, factors={es.unsafe.use_netty_default_chunk_and_page_size=false, g1gc_enabled=true, g1gc_region_size=4mb}]
[2025-01-16T13:29:56,821][INFO ][o.e.d.DiscoveryModule    ] [elastic-1] using discovery type [multi-node] and seed hosts providers [settings]
[2025-01-16T13:29:58,177][INFO ][o.e.n.Node               ] [elastic-1] initialized
[2025-01-16T13:29:58,178][INFO ][o.e.n.Node               ] [elastic-1] starting ...
[2025-01-16T13:29:58,238][INFO ][o.e.x.s.c.f.PersistentCache] [elastic-1] persistent cache index loaded
[2025-01-16T13:29:58,239][INFO ][o.e.x.d.l.DeprecationIndexingComponent] [elastic-1] deprecation component started
[2025-01-16T13:29:58,324][INFO ][o.e.t.TransportService   ] [elastic-1] publish_address {172.26.6.36:9300}, bound_addresses {172.26.6.36:9300}
[2025-01-16T13:30:02,523][INFO ][o.e.b.BootstrapChecks    ] [elastic-1] bound or publishing to a non-loopback address, enforcing bootstrap checks
[2025-01-16T13:30:02,578][ERROR][o.e.b.Elasticsearch      ] [elastic-1] node validation exception
[1] bootstrap checks failed. You must address the points described in the following [1] lines before starting Elasticsearch. For more information see [https://www.elastic.co/guide/en/elasticsearch/reference/8.15/bootstrap-checks.html]
bootstrap check failure [1] of [1]: memory locking requested for elasticsearch process but memory is not locked; for more information see [https://www.elastic.co/guide/en/elasticsearch/reference/8.15/_memory_lock_check.html]
[2025-01-16T13:30:02,584][INFO ][o.e.n.Node               ] [elastic-1] stopping ...
[2025-01-16T13:30:02,609][INFO ][o.e.n.Node               ] [elastic-1] stopped
[2025-01-16T13:30:02,609][INFO ][o.e.n.Node               ] [elastic-1] closing ...
[2025-01-16T13:30:32,632][INFO ][o.e.n.Node               ] [elastic-1] closed
[2025-01-16T13:30:32,636][INFO ][o.e.x.m.p.NativeController] [elastic-1] Native controller process has stopped - no new native processes can be started

leandrojmp · January 16, 2025, 6:45pm

[2025-01-16T13:30:02,578][ERROR][o.e.b.Elasticsearch ] [elastic-1] node validation exception
[1] bootstrap checks failed. You must address the points described in the following [1] lines before starting Elasticsearch. For more information see [Bootstrap Checks | Elasticsearch Guide [8.15] | Elastic]
bootstrap check failure [1] of [1]: memory locking requested for elasticsearch process but memory is not locked; for more information see [Memory lock check | Elasticsearch Guide [8.15] | Elastic]

Did something change in your system? It is failing the bootstrap checks. This normally prevents startup the first time you start a node, so I am not sure why you are getting this now.

Also, jvm.options is created during installation. On version 8, I think you should use a custom options file inside /etc/elasticsearch/jvm.options.d.
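
A sketch of how such a file could be written, assuming an 8 GB heap like the one reported in the startup log above. The file name `heap.options` and the function name are hypothetical; files in that directory need the `.options` suffix to be picked up.

```shell
# Write a heap override into a jvm.options.d directory.
# $1: directory (on package installs: /etc/elasticsearch/jvm.options.d)
# $2: heap size, e.g. 8g (illustrative value; size it to the RAM of the node)
write_heap_options() {
  dir="$1"
  heap="$2"
  # Xms and Xmx are set to the same value, as the Elasticsearch docs recommend.
  printf '%s\n' "-Xms$heap" "-Xmx$heap" > "$dir/heap.options"
}

# Usage (as root), followed by a restart of the service:
#   write_heap_options /etc/elasticsearch/jvm.options.d 8g
```

The usual guidance is to keep the heap at no more than half of the machine's RAM.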

Try adding this line to your elasticsearch.yml and see if it starts, but something does not seem right.

bootstrap.memory_lock: true

juancamiloll · January 16, 2025, 7:39pm

@leandrojmp Solved

I edited the following file and added the following line, then restarted Elasticsearch and it came up without errors.

/lib/systemd/system/elasticsearch.service

Thanks for your valuable assistance.
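
The post does not show which line was added, so the following is a hedged reference only. For the memory lock bootstrap check on systemd hosts, the Elasticsearch documentation describes setting `LimitMEMLOCK=infinity` for the service, usually through a drop-in override rather than by editing the packaged unit file. The sketch assumes that this is the relevant setting; the function name is hypothetical.

```shell
# Write a systemd drop-in that lifts the locked-memory limit for a service.
# $1: drop-in directory, e.g. /etc/systemd/system/elasticsearch.service.d
write_memlock_override() {
  dropin_dir="$1"
  mkdir -p "$dropin_dir"
  printf '%s\n' '[Service]' 'LimitMEMLOCK=infinity' > "$dropin_dir/override.conf"
}

# Usage (as root):
#   write_memlock_override /etc/systemd/system/elasticsearch.service.d
#   systemctl daemon-reload
#   systemctl restart elasticsearch.service
```

After the restart, `GET _nodes?filter_path=**.mlockall` should report `"mlockall": true` for each node if the lock succeeded.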
