Elasticsearch 7.13 crashed - cannot start again

Hi there, I have Elasticsearch + Kibana + Logstash + Filebeat installed on an Ubuntu 20.04 server.
Yesterday everything worked fine together, and I was happily writing requests in the Kibana console. This morning I tried to open Kibana, but all I get is an error:

{"statusCode":503,"error":"Service Unavailable","message":"License is not available."}

Kibana, Filebeat and Logstash are running. Only Elasticsearch failed.

I did the installation according to this tutorial.

The Elasticsearch log file contains this, and I am lost in it:

[2021-06-29T01:30:00,004][INFO ][o.e.x.m.MlDailyMaintenanceService] [vmi503579.contaboserver.net] triggering scheduled [ML] maintenance tasks
[2021-06-29T01:30:00,246][INFO ][o.e.x.m.a.TransportDeleteExpiredDataAction] [vmi503579.contaboserver.net] Deleting expired data
[2021-06-29T01:30:00,376][INFO ][o.e.x.m.j.r.UnusedStatsRemover] [vmi503579.contaboserver.net] Successfully deleted [0] unused stats documents
[2021-06-29T01:30:00,377][INFO ][o.e.x.m.a.TransportDeleteExpiredDataAction] [vmi503579.contaboserver.net] Completed deletion of expired ML data
[2021-06-29T01:30:00,378][INFO ][o.e.x.m.MlDailyMaintenanceService] [vmi503579.contaboserver.net] Successfully completed [ML] maintenance task: triggerDeleteExpiredDataTask
[2021-06-29T03:30:00,097][INFO ][o.e.x.s.SnapshotRetentionTask] [vmi503579.contaboserver.net] starting SLM retention snapshot cleanup task
[2021-06-29T03:30:00,118][INFO ][o.e.x.s.SnapshotRetentionTask] [vmi503579.contaboserver.net] there are no repositories to fetch, SLM retention snapshot cleanup task complete
[2021-06-29T05:30:13,713][WARN ][o.e.h.AbstractHttpServerTransport] [vmi503579.contaboserver.net] handling request [null][GET][/_xpack?accept_enterprise=true][Netty4HttpChannel{localAddress=/127.0.0.1:9200, remoteAddress=/127.0.0.1:43948}] took [5990ms] which is above the warn threshold of [5000ms]
[2021-06-29T05:43:16,668][WARN ][o.e.h.AbstractHttpServerTransport] [vmi503579.contaboserver.net] handling request [null][GET][/_xpack?accept_enterprise=true][Netty4HttpChannel{localAddress=/127.0.0.1:9200, remoteAddress=/127.0.0.1:43948}] took [5788ms] which is above the warn threshold of [5000ms]
[2021-06-29T06:07:53,829][WARN ][o.e.h.AbstractHttpServerTransport] [vmi503579.contaboserver.net] handling request [null][GET][/_xpack?accept_enterprise=true][Netty4HttpChannel{localAddress=/127.0.0.1:9200, remoteAddress=/127.0.0.1:43948}] took [6556ms] which is above the warn threshold of [5000ms]
[2021-06-29T06:10:08,555][WARN ][o.e.h.AbstractHttpServerTransport] [vmi503579.contaboserver.net] handling request [null][POST][/.kibana_task_manager/_update_by_query?ignore_unavailable=true&refresh=true&conflicts=proceed][Netty4HttpChannel{localAddress=/127.0.0.1:9200, remoteAddress=/127.0.0.1:42160}] took [6144ms] which is above the warn threshold of [5000ms]
[2021-06-29T06:10:26,682][WARN ][o.e.h.AbstractHttpServerTransport] [vmi503579.contaboserver.net] handling request [null][GET][/_xpack?accept_enterprise=true][Netty4HttpChannel{localAddress=/127.0.0.1:9200, remoteAddress=/127.0.0.1:43948}] took [8760ms] which is above the warn threshold of [5000ms]
[2021-06-29T06:10:54,156][WARN ][o.e.h.AbstractHttpServerTransport] [vmi503579.contaboserver.net] handling request [null][GET][/_xpack?accept_enterprise=true][Netty4HttpChannel{localAddress=/127.0.0.1:9200, remoteAddress=/127.0.0.1:42134}] took [5090ms] which is above the warn threshold of [5000ms]
[2021-06-29T06:11:57,929][WARN ][o.e.h.AbstractHttpServerTransport] [vmi503579.contaboserver.net] handling request [null][POST][/_bulk?refresh=false&_source_includes=originId&require_alias=true][Netty4HttpChannel{localAddress=/127.0.0.1:9200, remoteAddress=/127.0.0.1:43936}] took [5075ms] which is above the warn threshold of [5000ms]
[2021-06-29T06:12:15,830][WARN ][o.e.h.AbstractHttpServerTransport] [vmi503579.contaboserver.net] handling request [null][GET][/_xpack?accept_enterprise=true][Netty4HttpChannel{localAddress=/127.0.0.1:9200, remoteAddress=/127.0.0.1:42134}] took [5469ms] which is above the warn threshold of [5000ms]
[2021-06-29T06:18:10,461][WARN ][o.e.m.j.JvmGcMonitorService] [vmi503579.contaboserver.net] [gc][young][32703][27] duration [1s], collections [1]/[1.8s], total [1s]/[4s], memory [4.7gb]->[117.9mb]/[7.8gb], all_pools {[young] [4.6gb]->[0b]/[0b]}{[old] [112.3mb]->[114.8mb]/[7.8gb]}{[survivor] [5.2mb]->[3.1mb]/[0b]}
[2021-06-29T06:18:10,867][WARN ][o.e.m.j.JvmGcMonitorService] [vmi503579.contaboserver.net] [gc][32703] overhead, spent [1s] collecting in the last [1.8s]
[2021-06-29T06:19:17,955][WARN ][o.e.h.AbstractHttpServerTransport] [vmi503579.contaboserver.net] handling request [null][GET][/_xpack?accept_enterprise=true][Netty4HttpChannel{localAddress=/127.0.0.1:9200, remoteAddress=/127.0.0.1:42134}] took [5340ms] which is above the warn threshold of [5000ms]
[2021-06-29T06:19:18,381][WARN ][o.e.h.AbstractHttpServerTransport] [vmi503579.contaboserver.net] handling request [null][GET][/_xpack?accept_enterprise=true][Netty4HttpChannel{localAddress=/127.0.0.1:9200, remoteAddress=/127.0.0.1:43948}] took [5808ms] which is above the warn threshold of [5000ms]
[2021-06-29T06:21:22,114][WARN ][o.e.m.j.JvmGcMonitorService] [vmi503579.contaboserver.net] [gc][32829] overhead, spent [1.3s] collecting in the last [1.4s]
[2021-06-29T06:23:50,235][WARN ][o.e.m.j.JvmGcMonitorService] [vmi503579.contaboserver.net] [gc][young][32960][29] duration [1.4s], collections [1]/[2s], total [1.4s]/[6.8s], memory [502mb]->[117mb]/[7.8gb], all_pools {[young] [384mb]->[0b]/[0b]}{[old] [115mb]->[115.1mb]/[7.8gb]}{[survivor] [2.9mb]->[1.8mb]/[0b]}
[2021-06-29T06:23:50,238][WARN ][o.e.m.j.JvmGcMonitorService] [vmi503579.contaboserver.net] [gc][32960] overhead, spent [1.4s] collecting in the last [2s]
[2021-06-29T06:29:20,161][WARN ][o.e.m.j.JvmGcMonitorService] [vmi503579.contaboserver.net] [gc][young][33236][31] duration [3.4s], collections [1]/[4.3s], total [3.4s]/[10.3s], memory [504.6mb]->[118.5mb]/[7.8gb], all_pools {[young] [388mb]->[0b]/[0b]}{[old] [115.1mb]->[115.3mb]/[7.8gb]}{[survivor] [1.4mb]->[3.2mb]/[0b]}
[2021-06-29T06:29:20,167][WARN ][o.e.m.j.JvmGcMonitorService] [vmi503579.contaboserver.net] [gc][33236] overhead, spent [3.4s] collecting in the last [4.3s]
[2021-06-29T06:31:56,552][INFO ][o.e.m.j.JvmGcMonitorService] [vmi503579.contaboserver.net] [gc][33350] overhead, spent [317ms] collecting in the last [1s]
[2021-06-29T14:06:47,417][INFO ][o.e.x.i.a.TransportPutLifecycleAction] [vmi503579.contaboserver.net] adding index lifecycle policy [filebeat]
[2021-06-29T14:06:51,639][INFO ][o.e.c.m.MetadataIndexTemplateService] [vmi503579.contaboserver.net] adding template [filebeat-7.13.2] for index patterns [filebeat-7.13.2-*]
[2021-06-29T14:42:04,610][INFO ][o.e.c.m.MetadataMappingService] [vmi503579.contaboserver.net] [.ds-ilm-history-5-2021.06.28-000001/Q0TUJjerTkWe0uLrgZBV0g] update_mapping [_doc]
[2021-06-29T14:42:57,086][INFO ][o.e.c.m.MetadataCreateIndexService] [vmi503579.contaboserver.net] [.async-search] creating index, cause [api], templates [], shards [1]/[0]
[2021-06-29T14:42:58,402][INFO ][o.e.c.m.MetadataMappingService] [vmi503579.contaboserver.net] [.kibana_7.13.2_001/7E-_QSVnT5afL1kf4O292A] update_mapping [_doc]
[2021-06-29T14:52:03,203][INFO ][o.e.x.i.IndexLifecycleRunner] [vmi503579.contaboserver.net] policy [filebeat] for index [filebeat-7.13.2-2021.06.29] on an error step due to a transient error, moving back to the failed step [check-rollover-ready] for execution. retry attempt [1]
[2021-06-29T18:12:03,211][INFO ][o.e.x.i.IndexLifecycleRunner] [vmi503579.contaboserver.net] policy [filebeat] for index [filebeat-7.13.2-2021.06.29] on an error step due to a transient error, moving back to the failed step [check-rollover-ready] for execution. retry attempt [11]
[2021-06-29T18:15:39,243][INFO ][o.e.c.m.MetadataCreateIndexService] [vmi503579.contaboserver.net] [test] creating index, cause [api], templates [], shards [1]/[1]
[2021-06-29T18:20:24,239][INFO ][o.e.c.m.MetadataMappingService] [vmi503579.contaboserver.net] [test/2I65fUJYQh2AQlQAps4LuA] create_mapping [_doc]
[2021-06-29T18:21:38,142][INFO ][o.e.a.b.TransportShardBulkAction] [vmi503579.contaboserver.net] [test][0] mapping update rejected by primary
org.elasticsearch.indices.InvalidTypeNameException: mapping type name [_search] can't start with '_' unless it is called [_doc]
	at org.elasticsearch.index.mapper.MapperService.validateTypeName(MapperService.java:509) ~[elasticsearch-7.13.2.jar:7.13.2]
	at org.elasticsearch.index.mapper.MapperService.mergeMappings(MapperService.java:471) ~[elasticsearch-7.13.2.jar:7.13.2]
	at org.elasticsearch.index.mapper.MapperService.mergeAndApplyMappings(MapperService.java:348) ~[elasticsearch-7.13.2.jar:7.13.2]
	at org.elasticsearch.index.mapper.MapperService.merge(MapperService.java:343) ~[elasticsearch-7.13.2.jar:7.13.2]
	at org.elasticsearch.action.bulk.TransportShardBulkAction.executeBulkItemRequest(TransportShardBulkAction.java:283) [elasticsearch-7.13.2.jar:7.13.2]
	at org.elasticsearch.action.bulk.TransportShardBulkAction$2.doRun(TransportShardBulkAction.java:165) [elasticsearch-7.13.2.jar:7.13.2]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) [elasticsearch-7.13.2.jar:7.13.2]
	at org.elasticsearch.action.bulk.TransportShardBulkAction.performOnPrimary(TransportShardBulkAction.java:210) [elasticsearch-7.13.2.jar:7.13.2]
	at org.elasticsearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnPrimary(TransportShardBulkAction.java:116) [elasticsearch-7.13.2.jar:7.13.2]
	at org.elasticsearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnPrimary(TransportShardBulkAction.java:75) [elasticsearch-7.13.2.jar:7.13.2]
	at org.elasticsearch.action.support.replication.TransportWriteAction$1.doRun(TransportWriteAction.java:168) [elasticsearch-7.13.2.jar:7.13.2]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:732) [elasticsearch-7.13.2.jar:7.13.2]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) [elasticsearch-7.13.2.jar:7.13.2]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]
	at java.lang.Thread.run(Thread.java:831) [?:?]
[2021-06-29T19:32:03,219][INFO ][o.e.x.i.IndexLifecycleRunner] [vmi503579.contaboserver.net] policy [filebeat] for index [filebeat-7.13.2-2021.06.29] on an error step due to a transient error, moving back to the failed step [check-rollover-ready] for execution. retry attempt [15]
[2021-06-29T19:42:03,212][ERROR][o.e.x.i.IndexLifecycleRunner] [vmi503579.contaboserver.net] policy [filebeat] for index [filebeat-7.13.2-2021.06.29] failed on step [{"phase":"hot","action":"rollover","name":"check-rollover-ready"}]. Moving to ERROR step
java.lang.IllegalArgumentException: index.lifecycle.rollover_alias [filebeat-7.13.2] does not point to index [filebeat-7.13.2-2021.06.29]
	at org.elasticsearch.xpack.core.ilm.WaitForRolloverReadyStep.evaluateCondition(WaitForRolloverReadyStep.java:126) [x-pack-core-7.13.2.jar:7.13.2]
	at org.elasticsearch.xpack.ilm.IndexLifecycleRunner.runPeriodicStep(IndexLifecycleRunner.java:176) [x-pack-ilm-7.13.2.jar:7.13.2]
	at org.elasticsearch.xpack.ilm.IndexLifecycleService.triggerPolicies(IndexLifecycleService.java:333) [x-pack-ilm-7.13.2.jar:7.13.2]
	at org.elasticsearch.xpack.ilm.IndexLifecycleService.triggered(IndexLifecycleService.java:271) [x-pack-ilm-7.13.2.jar:7.13.2]
	at org.elasticsearch.xpack.core.scheduler.SchedulerEngine.notifyListeners(SchedulerEngine.java:184) [x-pack-core-7.13.2.jar:7.13.2]
	at org.elasticsearch.xpack.core.scheduler.SchedulerEngine$ActiveSchedule.run(SchedulerEngine.java:217) [x-pack-core-7.13.2.jar:7.13.2]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) [?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]
	at java.lang.Thread.run(Thread.java:831) [?:?]
[2021-06-29T19:52:03,290][INFO ][o.e.x.i.IndexLifecycleRunner] [vmi503579.contaboserver.net] policy [filebeat] for index [filebeat-7.13.2-2021.06.29] on an error step due to a transient error, moving back to the failed step [check-rollover-ready] for execution. retry attempt [16]

Does anybody know what the problem is?

Can you curl Elasticsearch directly to see what response you get?
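
If curl is not at hand, roughly the same check can be sketched in Python. This assumes the node listens on http://127.0.0.1:9200 without TLS or authentication (the address that appears in the log lines above); `probe` is a hypothetical helper name, not part of any Elastic tooling.

```python
import json
import urllib.error
import urllib.request


def probe(url="http://127.0.0.1:9200", timeout=5.0):
    """Send one GET to Elasticsearch and return (answered, detail)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = json.loads(resp.read().decode("utf-8"))
            version = "unknown"
            if isinstance(body, dict):
                # The root endpoint normally reports the node version here.
                version = body.get("version", {}).get("number", "unknown")
            return True, f"HTTP {resp.status}, version {version}"
    except urllib.error.HTTPError as exc:
        # The node answered, but with an error status (for example 401 or 503).
        return True, f"HTTP {exc.code}: {exc.reason}"
    except (urllib.error.URLError, OSError, ValueError) as exc:
        # Connection refused, timeout, malformed URL, or a non-JSON body.
        return False, f"no usable response: {exc}"
```

Calling `probe()` with no arguments targets the assumed local address. A `False` first element means nothing usable came back on that port, which points at the Elasticsearch process itself rather than at Kibana.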

Hi, no, Elasticsearch does not run at all; it cannot start. When I removed Logstash, it came back. Apparently Logstash can block Elasticsearch, which is not good. I think it is also in the log, but I don't understand it. It seems to be something related to ILM.

Showing more of the Elasticsearch log would be helpful, as there's nothing there that suggests Logstash is blocking it.

What about this?

[2021-06-29T19:32:03,219][INFO ][o.e.x.i.IndexLifecycleRunner] [vmi503579.contaboserver.net] policy [filebeat] for index [filebeat-7.13.2-2021.06.29] on an error step due to a transient error, moving back to the failed step [check-rollover-ready] for execution. retry attempt [15]
[2021-06-29T19:42:03,212][ERROR][o.e.x.i.IndexLifecycleRunner] [vmi503579.contaboserver.net] policy [filebeat] for index [filebeat-7.13.2-2021.06.29] failed on step [{"phase":"hot","action":"rollover","name":"check-rollover-ready"}]. Moving to ERROR step
java.lang.IllegalArgumentException: index.lifecycle.rollover_alias [filebeat-7.13.2] does not point to index [filebeat-7.13.2-2021.06.29]

Filebeat had a problem, but everything from Filebeat should go through Logstash to Elasticsearch.
The error log was full of this error, but comments are restricted to 13,000 characters.

Those issues come from index lifecycle management (ILM): the policy cannot roll over because the rollover alias does not point to the index. We have documentation on this in Tutorial: Automate rollover with ILM | Elasticsearch Guide [7.13] | Elastic
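
As a rough illustration of the bootstrap step that tutorial describes, the sketch below builds the request that creates an initial index and marks it as the write index for the rollover alias. The alias name is taken from the error message above. The `000001` suffix and the helper name `bootstrap_request` are assumptions for illustration, and whether this step fits your setup depends on how the dated index was created in the first place.

```python
import json


def bootstrap_request(alias, suffix="000001"):
    """Build the create-index request that bootstraps a rollover alias.

    Rollover expects the index name to end in a dash plus a number,
    so the initial index is named <alias>-<suffix>.
    """
    index = f"{alias}-{suffix}"
    body = {"aliases": {alias: {"is_write_index": True}}}
    return "PUT", f"/{index}", body


method, path, body = bootstrap_request("filebeat-7.13.2")
print(method, path)
print(json.dumps(body, indent=2))
```

The printed method, path and body can be pasted into the Kibana console. Writers would then need to index into the alias rather than into a dated index name, otherwise the same error is likely to come back.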

I would recommend this blog post, which I helped cross-check so that it includes all the necessary steps: Optimizing costs in Elastic Cloud: Hot-warm + index lifecycle management | Elastic Blog

Apart from that, your initially shared logs are full of slow-request warnings and long garbage collections.
It seems you have too few resources (or the data is not structured following best practices), hence the issues.
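
To get a feel for how bad this is across a whole log file, something like the following sketch could tally those warnings. The regular expressions are written against the two message shapes visible in the log excerpt above and only handle durations given in `ms` or `s`; `summarize` is a hypothetical helper name.

```python
import re

# Matches e.g. "took [8760ms] which is above the warn threshold of [5000ms]"
SLOW_REQUEST = re.compile(r"took \[(\d+)ms\] which is above the warn threshold")
# Matches e.g. "[gc][32829] overhead, spent [1.3s] collecting in the last [1.4s]"
GC_OVERHEAD = re.compile(
    r"\[gc\]\[\d+\] overhead, spent \[([\d.]+)(ms|s)\] "
    r"collecting in the last \[([\d.]+)(ms|s)\]"
)


def to_ms(value, unit):
    """Convert a duration such as ("1.3", "s") or ("317", "ms") to milliseconds."""
    return float(value) * (1000.0 if unit == "s" else 1.0)


def summarize(lines):
    """Count slow-request warnings and find the worst GC overhead fraction."""
    slow = []
    gc_fractions = []
    for line in lines:
        match = SLOW_REQUEST.search(line)
        if match:
            slow.append(int(match.group(1)))
            continue
        match = GC_OVERHEAD.search(line)
        if match:
            spent = to_ms(match.group(1), match.group(2))
            window = to_ms(match.group(3), match.group(4))
            gc_fractions.append(spent / window)
    return {
        "slow_requests": len(slow),
        "slowest_ms": max(slow, default=0),
        "worst_gc_fraction": max(gc_fractions, default=0.0),
    }
```

Feeding it the lines of the Elasticsearch log (for example via `summarize(open(path))`, with `path` pointing at your own log file) gives a quick count. A GC fraction close to 1 means the JVM spent almost the entire measured window collecting garbage.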

Thank you, I am going to look at it.
