Elasticsearch stalls for more than 10 minutes when restarted

When I restart Elasticsearch, it stalls for more than 10 minutes at "logger":"o.e.g.GatewayService", "log":"recovered [0] indices into cluster_state"

I don't know the root cause. Please help me. Thanks for your time!

{"type":"log","host":"local-altiplano-elasticsearch-0","level":"INFO","systemid":"e7c84e34d38a49e1ae639a5dab455af5","system":"BELK","time": "2019-09-05T09:33:16.669Z","logger":"o.e.p.PluginsService","timezone":"UTC","marker":"[es_node] ","log":"loaded module [transport-netty4]"}
{"type":"log","host":"local-altiplano-elasticsearch-0","level":"INFO","systemid":"e7c84e34d38a49e1ae639a5dab455af5","system":"BELK","time": "2019-09-05T09:33:16.669Z","logger":"o.e.p.PluginsService","timezone":"UTC","marker":"[es_node] ","log":"loaded module [tribe]"}
{"type":"log","host":"local-altiplano-elasticsearch-0","level":"INFO","systemid":"e7c84e34d38a49e1ae639a5dab455af5","system":"BELK","time": "2019-09-05T09:33:16.669Z","logger":"o.e.p.PluginsService","timezone":"UTC","marker":"[es_node] ","log":"loaded plugin [ingest-attachment]"}
{"type":"log","host":"local-altiplano-elasticsearch-0","level":"INFO","systemid":"e7c84e34d38a49e1ae639a5dab455af5","system":"BELK","time": "2019-09-05T09:33:16.670Z","logger":"o.e.p.PluginsService","timezone":"UTC","marker":"[es_node] ","log":"loaded plugin [ingest-geoip]"}
{"type":"log","host":"local-altiplano-elasticsearch-0","level":"INFO","systemid":"e7c84e34d38a49e1ae639a5dab455af5","system":"BELK","time": "2019-09-05T09:33:16.670Z","logger":"o.e.p.PluginsService","timezone":"UTC","marker":"[es_node] ","log":"loaded plugin [ingest-user-agent]"}
{"type":"log","host":"local-altiplano-elasticsearch-0","level":"WARN","systemid":"e7c84e34d38a49e1ae639a5dab455af5","system":"BELK","time": "2019-09-05T09:33:19.480Z","logger":"o.e.d.c.s.Settings","timezone":"UTC","marker":"[es_node] ","log":"[http.enabled] setting was deprecated in Elasticsearch and will be removed in a future release! See the breaking changes documentation for the next major version."}
{"type":"log","host":"local-altiplano-elasticsearch-0","level":"INFO","systemid":"e7c84e34d38a49e1ae639a5dab455af5","system":"BELK","time": "2019-09-05T09:33:20.487Z","logger":"o.e.d.DiscoveryModule","timezone":"UTC","marker":"[es_node] ","log":"using discovery type [zen] and host providers [settings]"}
{"type":"log","host":"local-altiplano-elasticsearch-0","level":"INFO","systemid":"e7c84e34d38a49e1ae639a5dab455af5","system":"BELK","time": "2019-09-05T09:33:21.204Z","logger":"o.e.n.Node","timezone":"UTC","marker":"[es_node] ","log":"initialized"}
{"type":"log","host":"local-altiplano-elasticsearch-0","level":"INFO","systemid":"e7c84e34d38a49e1ae639a5dab455af5","system":"BELK","time": "2019-09-05T09:33:21.204Z","logger":"o.e.n.Node","timezone":"UTC","marker":"[es_node] ","log":"starting ..."}
{"type":"log","host":"local-altiplano-elasticsearch-0","level":"INFO","systemid":"e7c84e34d38a49e1ae639a5dab455af5","system":"BELK","time": "2019-09-05T09:33:21.357Z","logger":"o.e.t.TransportService","timezone":"UTC","marker":"[es_node] ","log":"publish_address {172.17.0.15:9300}, bound_addresses {0.0.0.0:9300}"}
{"type":"log","host":"local-altiplano-elasticsearch-0","level":"INFO","systemid":"e7c84e34d38a49e1ae639a5dab455af5","system":"BELK","time": "2019-09-05T09:33:21.371Z","logger":"o.e.b.BootstrapChecks","timezone":"UTC","marker":"[es_node] ","log":"bound or publishing to a non-loopback address, enforcing bootstrap checks"}
{"type":"log","host":"local-altiplano-elasticsearch-0","level":"INFO","systemid":"e7c84e34d38a49e1ae639a5dab455af5","system":"BELK","time": "2019-09-05T09:33:24.444Z","logger":"o.e.c.s.MasterService","timezone":"UTC","marker":"[es_node] ","log":"zen-disco-elected-as-master ([0] nodes joined), reason: new_master {es_node}{B7xQoWLSQ2KV5vYUrJFRlA}{YwMOe7tQRu6qbh9xRZsCfg}{172.17.0.15}{172.17.0.15:9300}"}
{"type":"log","host":"local-altiplano-elasticsearch-0","level":"INFO","systemid":"e7c84e34d38a49e1ae639a5dab455af5","system":"BELK","time": "2019-09-05T09:33:24.454Z","logger":"o.e.c.s.ClusterApplierService","timezone":"UTC","marker":"[es_node] ","log":"new_master {es_node}{B7xQoWLSQ2KV5vYUrJFRlA}{YwMOe7tQRu6qbh9xRZsCfg}{172.17.0.15}{172.17.0.15:9300}, reason: apply cluster state (from master [master {es_node}{B7xQoWLSQ2KV5vYUrJFRlA}{YwMOe7tQRu6qbh9xRZsCfg}{172.17.0.15}{172.17.0.15:9300} committed version [1] source [zen-disco-elected-as-master ([0] nodes joined)]])"}
{"type":"log","host":"local-altiplano-elasticsearch-0","level":"INFO","systemid":"e7c84e34d38a49e1ae639a5dab455af5","system":"BELK","time": "2019-09-05T09:33:24.488Z","logger":"o.e.h.n.Netty4HttpServerTransport","timezone":"UTC","marker":"[es_node] ","log":"publish_address {172.17.0.15:9200}, bound_addresses {0.0.0.0:9200}"}
{"type":"log","host":"local-altiplano-elasticsearch-0","level":"INFO","systemid":"e7c84e34d38a49e1ae639a5dab455af5","system":"BELK","time": "2019-09-05T09:33:24.488Z","logger":"o.e.n.Node","timezone":"UTC","marker":"[es_node] ","log":"started"}
{"type":"log","host":"local-altiplano-elasticsearch-0","level":"INFO","systemid":"e7c84e34d38a49e1ae639a5dab455af5","system":"BELK","time": "2019-09-05T**09:33:24**.500Z","logger":"o.e.g.GatewayService","timezone":"UTC","marker":"[es_node] ","log":"recovered [0] indices into cluster_state"}
{"type":"log","host":"local-altiplano-elasticsearch-0","level":"WARN","systemid":"e7c84e34d38a49e1ae639a5dab455af5","system":"BELK","time": "2019-09-05T**09:48:36**.955Z","logger":"o.e.d.c.m.MetaDataCreateIndexService","timezone":"UTC","marker":"[es_node] ","log":"the default number of shards will change from [5] to [1] in 7.0.0; if you wish to continue using the default of [5] shards, you must manage this on the create index request or with an index template"}
{"type":"log","host":"local-altiplano-elasticsearch-0","level":"INFO","systemid":"e7c84e34d38a49e1ae639a5dab455af5","system":"BELK","time": "2019-09-05T09:48:37.046Z","logger":"o.e.c.m.MetaDataCreateIndexService","timezone":"UTC","marker":"[es_node] ","log":"[logstash-2019.09.05] creating index, cause [auto(bulk api)], templates [], shards [5]/[1], mappings []"}
{"type":"log","host":"local-altiplano-elasticsearch-0","level":"INFO","systemid":"e7c84e34d38a49e1ae639a5dab455af5","system":"BELK","time": "2019-09-05T09:48:37.539Z","logger":"o.e.c.m.MetaDataMappingService","timezone":"UTC","marker":"[es_node] ","log":"[logstash-2019.09.05/Ie4GTU_UQ0KGvGYH2qdsag] create_mapping [fluentd]"}
{"type":"log","host":"local-altiplano-elasticsearch-0","level":"INFO","systemid":"e7c84e34d38a49e1ae639a5dab455af5","system":"BELK","time": "2019-09-05T09:48:38.350Z","logger":"o.e.c.m.MetaDataMappingService","timezone":"UTC","marker":"[es_node] ","log":"[logstash-2019.09.05/Ie4GTU_UQ0KGvGYH2qdsag] update_mapping [fluentd]"}

Can you share your elasticsearch.yml please? And any other settings you are using?

The Elasticsearch.yml is the following:
version: '3.4'
networks:
  esnet:
    external: true
services:
  kibana:
    image: "${ARTIFACTORY_IP}:9000/elk_k:${KIBANA_VERSION}"
    container_name: anv-kibana
    restart: always
    ports:
      - "5601:5601"
    environment:
      - ELASTICSEARCH_URL=http://elasticsearch:9200
      - TZ=UTC
    depends_on:
      - elasticsearch
    networks:
      - esnet
  elasticsearch:
    image: "${ARTIFACTORY_IP}:9000/elk_e:${ES_VERSION}"
    container_name: anv-elasticsearch
    restart: always
    ports:
      - "9200:9200"
    volumes:
      - /fnms/data_backup:/data/elasticsearch_backup
      - /fnms/config:/etc/elasticsearch/config
    environment:
      - ES_JAVA_OPTS=-Xms26g -Xmx26g
      - TZ=UTC
    networks:
      - esnet

The .env is the following:
ARTIFACTORY_IP=10.131.69.114
KIBANA_VERSION=6.5.4-19.03.01
ES_VERSION=6.5.4-19.01.0

The issue only occurs when restarting Elasticsearch. Please advise. Thank you!

Actually, the issue doesn't always appear; it just occurs most of the time.

Given that you don't make any changes to your config, and the error is not consistent, it's hard to get to the bottom of the issue. You can turn on DEBUG logging to see more detailed output; one way to do that is sketched below. Also make sure to monitor disk and memory resources.
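
A minimal sketch of raising the log level at runtime via the cluster settings API, assuming the node is reachable on localhost:9200 (the logger name org.elasticsearch.gateway is just an illustrative choice, and note that a transient setting is cleared again on a full cluster restart):

curl -XPUT "http://localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
{
  "transient": {
    "logger.org.elasticsearch.gateway": "DEBUG"
  }
}'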

This is your docker-compose file, not elasticsearch.yml.
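
For comparison, a minimal, purely illustrative elasticsearch.yml (none of these values are taken from your setup) looks more like this:

cluster.name: my-cluster
node.name: es_node
network.host: 0.0.0.0
http.port: 9200
path.data: /usr/share/elasticsearch/data
path.logs: /usr/share/elasticsearch/logs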

The log you posted shows that Elasticsearch has started and is running.
What do you get with curl -XGET http://elasticsearch:9200/
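
If the node is up, that call typically returns a small JSON banner along these lines (the values here are illustrative, not taken from your cluster):

{
  "name" : "es_node",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "6.5.4"
  },
  "tagline" : "You Know, for Search"
}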

Yes, it's running. But it stalls at o.e.g.GatewayService for more than 10 minutes. This only occurs when it's restarted.

And when that happens, there is nothing else in the logs?

Are you able to log on to the machine and run journalctl -xe or service elasticsearch status?
I am not really familiar with Docker images, so that might not be possible...
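
Since Elasticsearch is running in a container here, the rough Docker equivalents would be something like the following (container name taken from the compose file above; the curl call assumes curl is available inside the image):

# follow the Elasticsearch container's log output
docker logs -f anv-elasticsearch

# check cluster health from inside the container
docker exec -it anv-elasticsearch curl -s "http://localhost:9200/_cluster/health?pretty"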

Actually, when Elasticsearch is restarted, Kibana stops displaying logs. Elasticsearch then comes back up but stops logging at o.e.g.GatewayService. Kibana only starts displaying logs again once Elasticsearch continues with MetaDataCreateIndexService and MetaDataMappingService. The delay is more than 10 minutes. I don't know how to find the root cause.

Sorry, that's my mistake. I use Docker to run Elasticsearch.
I think the image below is the elasticsearch.yml.

Thanks for your advice. I tried that, but the log level resets to INFO automatically when Elasticsearch is restarted. Do you have any ideas?
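
My guess is that a transient cluster setting is cleared when the whole cluster restarts. Perhaps a line like the following in elasticsearch.yml would keep the level across restarts, but I have not verified this:

logger.org.elasticsearch.gateway: DEBUG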

I'm struggling to understand what the issue is here. This sounds like completely normal behaviour for Elasticsearch. It doesn't log much when it's running normally, and recovered [0] indices into cluster_state is usually the last line it logs when starting up.

Can you explain more clearly why you think this is a problem?

When it starts for the first time, it runs and shows "logger":"o.e.c.m.MetaDataMappingService", "log":"[logstash-2019.09.05/Ie4GTU_UQ0KGvGYH2qdsag] update_mapping [fluentd]" as the last line without any delay. But when restarted, it stops at the "recovered [0] indices into cluster_state" log for that period of time and only then continues with the "update_mapping [fluentd]" log.

Hence I think this is a problem.

Are you using persistent storage for your data path /data/data?
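
For reference, persisting the data path usually means mounting a host directory (or named volume) at the container's data path in the compose file. A hedged sketch, where both the host directory /fnms/data and the in-container path /data/data are assumptions based on this thread rather than verified against the image:

  elasticsearch:
    volumes:
      - /fnms/data:/data/data
      - /fnms/data_backup:/data/elasticsearch_backup
      - /fnms/config:/etc/elasticsearch/config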

I found out the root cause. It's the default request_timeout of 5s in Fluentd's Elasticsearch output.

The Fluentd process can get into this state when every attempt to write logs to an Elasticsearch instance takes longer than 5 seconds to complete. Until a consistent number of writes takes less than 5 seconds to complete, logging essentially stops working.

In my case, setting request_timeout to 1m makes it work fine, as in the snippet below.
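
For anyone else hitting this, a hedged sketch of the relevant part of the Fluentd output configuration (the match pattern, host, and port are illustrative, not my exact settings):

<match **>
  @type elasticsearch
  host elasticsearch
  port 9200
  logstash_format true
  # the plugin default is 5s
  request_timeout 1m
</match>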

Thank you so much for your time!

Reference: https://bugzilla.redhat.com/show_bug.cgi?id=1497836

Expanding on @Christian_Dahlqvist's question, the message indicates that there are no indices (i.e. no data at all) when the node restarts. It sounds like you previously inserted some data, so the question is why it went away. The most likely explanation is that your data path is emptied when you restart your container.

I don't think anything is wrong with Elasticsearch here (it looks like it's working normally and waiting for data), but maybe the process that you expect to send it data is waiting to find the index it created before the node restarted; after 10 minutes perhaps it gives up and creates a new index.
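
One hedged way to check this, assuming the node is reachable on localhost:9200, is to list the indices right before and right after a restart and see whether they survive:

curl -XGET "http://localhost:9200/_cat/indices?v"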

Yes, you're right. The root cause is the configuration for Fluentd.
