I am using the ELK stack (Elasticsearch and Kibana) in a Docker setup to monitor one of our applications, which also runs in Docker. This is a dedicated machine. I use Filebeat to send logs to the ELK stack; Filebeat also runs in Docker. Whenever there is a new deployment, the Filebeat container is taken down and started up again (running filebeat setup, then filebeat run). It works for a week or so, then the Kibana interface goes down. This has happened about four times now.
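For reference, the redeploy step is roughly the following (a simplified sketch; the container name is hypothetical and the real volumes, config mounts, and network flags are omitted):

# stop and remove the current Filebeat container
docker stop filebeat && docker rm filebeat
# recreate it from the same image; on startup it runs "filebeat setup" followed by "filebeat run"
docker run -d --name filebeat docker.elastic.co/beats/filebeat:7.8.1

The point is that filebeat setup is executed on every deployment, not only the first time.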
Here are the logs from Elasticsearch. Once these exceptions appear, there are no further logs from the Kibana container.
elasticsearch | {"type": "server", "timestamp": "2020-09-01T09:11:31,191Z", "level": "DEBUG", "component": "o.e.a.s.TransportSearchAction", "cluster.name": "docker-cluster", "node.name": "elasticsearch", "message": "[.kibana_task_manager_1][0], node[jVcP3kfJSdG7RezbDK7-yg], [P], s[STARTED], a[id=WkViw7IdQsC2EVFcS1SeWQ]: Failed to execute [SearchRequest{searchType=QUERY_THEN_FETCH, indices=[], indicesOptions=IndicesOptions[ignore_unavailable=false, allow_no_indices=true, expand_wildcards_open=true, expand_wildcards_closed=false, expand_wildcards_hidden=false, allow_aliases_to_multiple_indices=true, forbid_closed_indices=true, ignore_aliases=false, ignore_throttled=true], types=[], routing='null', preference='null', requestCache=null, scroll=null, maxConcurrentShardRequests=0, batchedReduceSize=512, preFilterShardSize=null, allowPartialSearchResults=true, localClusterAlias=null, getOrCreateAbsoluteStartMillis=-1, ccsMinimizeRoundtrips=true, source={}}] lastShard [true]", "cluster.uuid": "tDA17sj0SxOuNyBSrmphBg", "node.id": "jVcP3kfJSdG7RezbDK7-yg" ,
elasticsearch | "stacktrace": ["org.elasticsearch.transport.TransportException: failure to send",
elasticsearch | "at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:660) [elasticsearch-7.8.1.jar:7.8.1]",
elasticsearch | "at org.elasticsearch.transport.TransportService.sendChildRequest(TransportService.java:704) [elasticsearch-7.8.1.jar:7.8.1]",
elasticsearch | "at org.elasticsearch.transport.TransportService.sendChildRequest(TransportService.java:696) [elasticsearch-7.8.1.jar:7.8.1]",
elasticsearch | "at org.elasticsearch.action.search.SearchTransportService.sendExecuteQuery(SearchTransportService.java:138) [elasticsearch-7.8.1.jar:7.8.1]",
elasticsearch | "at org.elasticsearch.action.search.SearchQueryThenFetchAsyncAction.executePhaseOnShard(SearchQueryThenFetchAsyncAction.java:79) [elasticsearch-7.8.1.jar:7.8.1]",
elasticsearch | "at org.elasticsearch.action.search.AbstractSearchAsyncAction.lambda$performPhaseOnShard$3(AbstractSearchAsyncAction.java:231) [elasticsearch-7.8.1.jar:7.8.1]",
elasticsearch | "at org.elasticsearch.action.search.AbstractSearchAsyncAction$PendingExecutions.tryRun(AbstractSearchAsyncAction.java:668) [elasticsearch-7.8.1.jar:7.8.1]",
elasticsearch | "at org.elasticsearch.action.search.AbstractSearchAsyncAction$PendingExecutions.finishAndRunNext(AbstractSearchAsyncAction.java:662) [elasticsearch-7.8.1.jar:7.8.1]",
elasticsearch | "at org.elasticsearch.action.search.AbstractSearchAsyncAction.executeNext(AbstractSearchAsyncAction.java:640) [elasticsearch-7.8.1.jar:7.8.1]",
elasticsearch | "at org.elasticsearch.action.search.AbstractSearchAsyncAction.executeNext(AbstractSearchAsyncAction.java:632) [elasticsearch-7.8.1.jar:7.8.1]",
elasticsearch | "at org.elasticsearch.action.search.AbstractSearchAsyncAction.access$000(AbstractSearchAsyncAction.java:68) [elasticsearch-7.8.1.jar:7.8.1]",
elasticsearch | "at org.elasticsearch.action.search.AbstractSearchAsyncAction$1.innerOnResponse(AbstractSearchAsyncAction.java:238) [elasticsearch-7.8.1.jar:7.8.1]",
elasticsearch | "at org.elasticsearch.action.search.SearchActionListener.onResponse(SearchActionListener.java:45) [elasticsearch-7.8.1.jar:7.8.1]",
elasticsearch | "at org.elasticsearch.action.search.SearchActionListener.onResponse(SearchActionListener.java:29) [elasticsearch-7.8.1.jar:7.8.1]",
elasticsearch | "at org.elasticsearch.action.search.SearchExecutionStatsCollector.onResponse(SearchExecutionStatsCollector.java:68) [elasticsearch-7.8.1.jar:7.8.1]",
elasticsearch | "at org.elasticsearch.action.search.SearchExecutionStatsCollector.onResponse(SearchExecutionStatsCollector.java:36) [elasticsearch-7.8.1.jar:7.8.1]",
elasticsearch | "at org.elasticsearch.action.ActionListenerResponseHandler.handleResponse(ActionListenerResponseHandler.java:54) [elasticsearch-7.8.1.jar:7.8.1]",
elasticsearch | "at org.elasticsearch.action.search.SearchTransportService$ConnectionCountingHandler.handleResponse(SearchTransportService.java:394) [elasticsearch-7.8.1.jar:7.8.1]",
elasticsearch | "at org.elasticsearch.transport.TransportService$6.handleResponse(TransportService.java:633) [elasticsearch-7.8.1.jar:7.8.1]",
elasticsearch | "at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleResponse(TransportService.java:1163) [elasticsearch-7.8.1.jar:7.8.1]",
elasticsearch | "at org.elasticsearch.transport.TransportService$DirectResponseChannel.processResponse(TransportService.java:1241) [elasticsearch-7.8.1.jar:7.8.1]",
elasticsearch | "at org.elasticsearch.transport.TransportService$DirectResponseChannel.sendResponse(TransportService.java:1221) [elasticsearch-7.8.1.jar:7.8.1]",
elasticsearch | "at org.elasticsearch.transport.TaskTransportChannel.sendResponse(TaskTransportChannel.java:54) [elasticsearch-7.8.1.jar:7.8.1]",
elasticsearch | "at org.elasticsearch.action.support.ChannelActionListener.onResponse(ChannelActionListener.java:47) [elasticsearch-7.8.1.jar:7.8.1]",
elasticsearch | "at org.elasticsearch.action.support.ChannelActionListener.onResponse(ChannelActionListener.java:30) [elasticsearch-7.8.1.jar:7.8.1]",
elasticsearch | "at org.elasticsearch.search.SearchService.lambda$runAsync$0(SearchService.java:416) [elasticsearch-7.8.1.jar:7.8.1]",
elasticsearch | "at org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:44) [elasticsearch-7.8.1.jar:7.8.1]",
elasticsearch | "at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:695) [elasticsearch-7.8.1.jar:7.8.1]",
elasticsearch | "at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-7.8.1.jar:7.8.1]",
elasticsearch | "at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]",
elasticsearch | "at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]",
elasticsearch | "at java.lang.Thread.run(Thread.java:832) [?:?]",
elasticsearch | "Caused by: org.elasticsearch.tasks.TaskCancelledException: The parent task was cancelled, shouldn't start any child tasks",
elasticsearch | "at org.elasticsearch.tasks.TaskManager$CancellableTaskHolder.registerChildNode(TaskManager.java:521) ~[elasticsearch-7.8.1.jar:7.8.1]",
elasticsearch | "at org.elasticsearch.tasks.TaskManager.registerChildNode(TaskManager.java:201) ~[elasticsearch-7.8.1.jar:7.8.1]",
elasticsearch | "at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:627) ~[elasticsearch-7.8.1.jar:7.8.1]",
elasticsearch | "... 31 more"] }
elasticsearch | {"type": "server", "timestamp": "2020-09-01T09:11:31,194Z", "level": "DEBUG", "component": "o.e.a.a.c.n.t.c.TransportCancelTasksAction", "cluster.name": "docker-cluster", "node.name": "elasticsearch", "message": "Removing ban for the parent [jVcP3kfJSdG7RezbDK7-yg:405287] on the node [jVcP3kfJSdG7RezbDK7-yg]", "cluster.uuid": "tDA17sj0SxOuNyBSrmphBg", "node.id": "jVcP3kfJSdG7RezbDK7-yg" }
Here are the logs from the Filebeat container trying to start up.
2020-09-09T06:45:36.642Z INFO instance/beat.go:647 Home path: [/usr/share/filebeat] Config path: [/usr/share/filebeat] Data path: [/usr/share/filebeat/data] Logs path: [/usr/share/filebeat/logs]
2020-09-09T06:45:36.655Z INFO instance/beat.go:655 Beat ID: 49e4b497-d5eb-49ab-bdb4-40bdb42d6ca8
2020-09-09T06:45:36.656Z INFO [beat] instance/beat.go:983 Beat info {"system_info": {"beat": {"path": {"config": "/usr/share/filebeat", "data": "/usr/share/filebeat/data", "home": "/usr/share/filebeat", "logs": "/usr/share/filebeat/logs"}, "type": "filebeat", "uuid": "49e4b497-d5eb-49ab-bdb4-40bdb42d6ca8"}}}
2020-09-09T06:45:36.656Z INFO [beat] instance/beat.go:992 Build info {"system_info": {"build": {"commit": "94f7632be5d56a7928595da79f4b829ffe123744", "libbeat": "7.8.1", "time": "2020-07-21T15:12:45.000Z", "version": "7.8.1"}}}
2020-09-09T06:45:36.656Z INFO [beat] instance/beat.go:995 Go runtime info {"system_info": {"go": {"os":"linux","arch":"amd64","max_procs":4,"version":"go1.13.10"}}}
2020-09-09T06:45:36.662Z INFO [beat] instance/beat.go:999 Host info {"system_info": {"host": {"architecture":"x86_64","boot_time":"2020-08-26T11:33:11Z","containerized":true,"name":"cc58258203f1","ip":["127.0.0.1/8","172.18.0.2/16"],"kernel_version":"4.15.0-112-generic","mac":["02:42:ac:12:00:02"],"os":{"family":"redhat","platform":"centos","name":"CentOS Linux","version":"7 (Core)","major":7,"minor":8,"patch":2003,"codename":"Core"},"timezone":"UTC","timezone_offset_sec":0,"id":"1a018e03a49f4bfc904c69b0d6c08959"}}}
2020-09-09T06:45:36.663Z INFO [beat] instance/beat.go:1028 Process info {"system_info": {"process": {"capabilities": {"inheritable":["chown","dac_override","fowner","fsetid","kill","setgid","setuid","setpcap","net_bind_service","net_raw","sys_chroot","mknod","audit_write","setfcap"],"permitted":null,"effective":null,"bounding":["chown","dac_override","fowner","fsetid","kill","setgid","setuid","setpcap","net_bind_service","net_raw","sys_chroot","mknod","audit_write","setfcap"],"ambient":null}, "cwd": "/usr/share/filebeat", "exe": "/usr/share/filebeat/filebeat", "name": "filebeat", "pid": 1, "ppid": 0, "seccomp": {"mode":"filter","no_new_privs":false}, "start_time": "2020-09-09T06:45:35.930Z"}}}
2020-09-09T06:45:36.663Z INFO instance/beat.go:310 Setup Beat: filebeat; Version: 7.8.1
2020-09-09T06:45:36.664Z INFO [index-management] idxmgmt/std.go:184 Set output.elasticsearch.index to 'filebeat-7.8.1' as ILM is enabled.
2020-09-09T06:45:36.664Z INFO eslegclient/connection.go:99 elasticsearch url: http://10.200.160.5:9200
2020-09-09T06:45:36.665Z INFO [publisher] pipeline/module.go:113 Beat name: cc58258203f1
2020-09-09T06:45:36.670Z INFO eslegclient/connection.go:99 elasticsearch url: http://10.200.160.5:9200
2020-09-09T06:45:36.680Z INFO [esclientleg] eslegclient/connection.go:314 Attempting to connect to Elasticsearch version 7.8.1
Overwriting ILM policy is disabled. Set
setup.ilm.overwrite:true for enabling.
2020-09-09T06:45:36.696Z INFO [index-management] idxmgmt/std.go:261 Auto ILM enable success.
2020-09-09T06:45:36.698Z INFO [index-management.ilm] ilm/std.go:139 do not generate ilm policy: exists=true, overwrite=false
2020-09-09T06:45:36.698Z INFO [index-management] idxmgmt/std.go:274 ILM policy successfully loaded.
2020-09-09T06:45:36.698Z INFO [index-management] idxmgmt/std.go:407 Set setup.template.name to '{filebeat-7.8.1 {now/d}-000001}' as ILM is enabled.
2020-09-09T06:45:36.698Z INFO [index-management] idxmgmt/std.go:412 Set setup.template.pattern to 'filebeat-7.8.1-*' as ILM is enabled.
2020-09-09T06:45:36.698Z INFO [index-management] idxmgmt/std.go:446 Set settings.index.lifecycle.rollover_alias in template to {filebeat-7.8.1 {now/d}-000001} as ILM is enabled.
2020-09-09T06:45:36.699Z INFO [index-management] idxmgmt/std.go:450 Set settings.index.lifecycle.name in template to {filebeat {"policy":{"phases":{"hot":{"actions":{"rollover":{"max_age":"30d","max_size":"50gb"}}}}}}} as ILM is enabled.
2020-09-09T06:45:36.701Z INFO template/load.go:169 Existing template will be overwritten, as overwrite is enabled.
2020-09-09T06:45:37.081Z INFO [add_cloud_metadata] add_cloud_metadata/add_cloud_metadata.go:93 add_cloud_metadata: hosting provider type detected as openstack, metadata={"availability_zone":"zone-2","instance":{"id":"i-0001d713","name":"xxxxxxxxxxxxxxxxx.novalocal"},"machine":{"type":"R1-Generic-4"},"provider":"openstack"}
2020-09-09T06:45:37.116Z INFO template/load.go:109 Try loading template filebeat-7.8.1 to Elasticsearch
2020-09-09T06:45:37.313Z INFO template/load.go:101 template with name 'filebeat-7.8.1' loaded.
2020-09-09T06:45:37.313Z INFO [index-management] idxmgmt/std.go:298 Loaded index template.
2020-09-09T06:45:37.316Z ERROR instance/beat.go:958 Exiting: resource 'filebeat-7.8.1' exists, but it is not an alias
Exiting: resource 'filebeat-7.8.1' exists, but it is not an alias
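If it helps, I assume this last error means the name filebeat-7.8.1 is currently held by a concrete index rather than the ILM rollover alias. I could check that with something like the following (hypothetical diagnostic commands, using the Elasticsearch URL from the Filebeat log above):

# list aliases and indices matching the Filebeat naming pattern
curl -s 'http://10.200.160.5:9200/_cat/aliases/filebeat-*?v'
curl -s 'http://10.200.160.5:9200/_cat/indices/filebeat-*?v'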
Can someone tell me what is going on?