Kibana lost connection to Elasticsearch after some time

I'm following this guide to run Elasticsearch and Kibana on Docker. Below is my docker-compose.yml file:

version: "3.7"
services:
  es01:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.6.2
    container_name: es01
    ports:
      - "9200:9200"
      - "9300:9300"
    volumes:
      - $HOME/elasticsearch/data:/usr/share/elasticsearch/data
    environment:
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
      - discovery.type=single-node
      - TAKE_FILE_OWNERSHIP=true
    ulimits:
      memlock:
        soft: -1
        hard: -1
    networks:
      - elastic

  kibana:
    image: docker.elastic.co/kibana/kibana:7.6.2
    container_name: kibana
    ports:
      - "5601:5601"
    environment:
      ELASTICSEARCH_URL: http://es01:9200
      ELASTICSEARCH_HOSTS: http://es01:9200
      SERVER_HOST: 0.0.0.0
    networks:
      - elastic

networks:
  elastic:
    driver: bridge

Everything works perfectly at first. After around 10 minutes, Kibana stops working (Elasticsearch still works fine). When I check the logs, I see this message everywhere:

Error: Request Timeout after 30000ms

Accessing the webpage gives me this:

{
  "statusCode": 503,
  "error": "Service Unavailable",
  "message": "Request Timeout after 30000ms"
}

It seems that the connection between Kibana and Elasticsearch breaks after about 10 minutes. Restarting the containers doesn't help.

Why do I have this strange problem? How can I fix it?

Thanks for your help.

Hey @tdoan,

If you enable debug logging for Kibana (LOGGING_VERBOSE: true), does that give you any more information about why this might be happening?

Does this 10-minute delay happen regardless of your activity in Kibana? In other words, does an idle Kibana still start failing after 10 minutes?
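For reference, something like this in your compose file should do it (just a sketch based on the file you posted; the Kibana Docker image translates environment variables such as LOGGING_VERBOSE into the corresponding kibana.yml setting, here logging.verbose):

```yaml
  kibana:
    image: docker.elastic.co/kibana/kibana:7.6.2
    environment:
      ELASTICSEARCH_HOSTS: http://es01:9200
      SERVER_HOST: 0.0.0.0
      LOGGING_VERBOSE: "true"   # maps to logging.verbose in kibana.yml
```

You can then follow the output with `docker logs -f kibana`.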

Hi @Larry_Gregory,

Thank you for your quick reply.
When I check the log, it says that the connection to Elasticsearch timed out. I will try setting LOGGING_VERBOSE and see if I get any more information.

The 10-minute delay happens no matter what I do in Kibana. Strangely, if I run Kibana (in Docker) on another machine, everything works fine. But when both run on the same machine, as in my current setup, the problem occurs.

The log doesn't show anything new. It looks something like this:

Could not create APM Agent configuration: Request Timeout after 30000ms
{"type":"log","@timestamp":"2020-05-05T12:11:26Z","tags":["warning","plugins","licensing"],"pid":6,"message":"License information could not be obtained from Elasticsearch due to Error: Request Timeout after 30000ms error"}
{"type":"log","@timestamp":"2020-05-05T12:11:26Z","tags":["error","savedobjects-service"],"pid":6,"message":"Unable to retrieve version information from Elasticsearch nodes."}
{"type":"log","@timestamp":"2020-05-05T12:11:56Z","tags":["warning","plugins","licensing"],"pid":6,"message":"License information could not be obtained from Elasticsearch due to Error: Request Timeout after 30000ms error"}
{"type":"log","@timestamp":"2020-05-05T12:12:26Z","tags":["warning","plugins","licensing"],"pid":6,"message":"License information could not be obtained from Elasticsearch due to Error: Request Timeout after 30000ms error"}
{"type":"log","@timestamp":"2020-05-05T12:12:56Z","tags":["warning","plugins","licensing"],"pid":6,"message":"License information could not be obtained from Elasticsearch due to Error: Request Timeout after 30000ms error"}
{"type":"log","@timestamp":"2020-05-05T12:13:26Z","tags":["warning","plugins","licensing"],"pid":6,"message":"License information could not be obtained from Elasticsearch due to Error: Request Timeout after 30000ms error"}

I can confirm that my basic license is active when I check http://my-es-server:9200/_xpack.

That's interesting; I don't recall seeing this in the past. When you're testing your connection to ES, are you doing so from within Kibana's Docker container, or from outside that container?

Can you make a request to ES's /_cluster/health API endpoint once Kibana stops responding, and report that result back here?
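While you're at it, a healthcheck on the es01 service would let Docker itself report whether ES stops responding. A sketch to merge into your existing service definition (the intervals here are arbitrary; curl is available in the official elasticsearch image):

```yaml
  es01:
    healthcheck:
      test: ["CMD-SHELL", "curl -s -f http://localhost:9200/_cluster/health || exit 1"]
      interval: 30s
      timeout: 10s
      retries: 5
```

`docker ps` will then show the container as healthy or unhealthy, which helps tell a container-level problem apart from a network one.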

Requesting /_cluster/health gives me this:

{
  "cluster_name" : "es-docker-cluster",
  "status" : "yellow",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 6,
  "active_shards" : 6,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 3,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 66.66666666666666
}

I checked the connection to ES from outside of Kibana's Docker container. I don't know how to test it from inside the container, since the curl command is not available there.

Thanks @tdoan. I'll take a closer look over the next couple of days and try to reproduce this on my end, because nothing seems out of place here.

Using the docker-compose example you provided, I'm not able to reproduce this. I've had it running for about 30 minutes now with no errors and no sign of a crash. Is there any other information you can share to help diagnose this?

After some further investigation, it turned out that the problem was caused by Puppet, which was somehow interfering with the Docker network. Once Puppet was shut down, everything worked fine again.

Thank you for your help :smiley: