Hi all,
We host our code on GitHub and use GitHub workflows for our CI/CD pipeline. On each push to a feature branch, two services described in a Docker Compose file are started and some tests are run. One of the two services is Elasticsearch, using the image docker.elastic.co/elasticsearch/elasticsearch:8.5.3. The other depends on the Elasticsearch service and starts only once Elasticsearch is healthy. Below you can see the Compose description of the elasticsearch service:
elasticsearch:
  image: docker.elastic.co/elasticsearch/elasticsearch:8.5.3
  environment:
    discovery.type: single-node
    ELASTIC_PASSWORD: ******
    ELASTIC_USERNAME: *******
    xpack.security.enabled: "false"
    xpack.security.enrollment.enabled: "false"
    xpack.security.http.ssl.enabled: "false"
    xpack.security.transport.ssl.enabled: "false"
    action.destructive_requires_name: "false"
  healthcheck:
    test: curl -s http://elasticsearch:9200 >/dev/null || exit 1
    interval: 10s
    timeout: 5s
    retries: 10
  ports:
    - 9200:9200
  ulimits:
    memlock:
      soft: -1
      hard: -1
On each push to GitHub, those two services are brought up and a suite of tests written with the pytest framework runs. Some of these tests access Elasticsearch and perform operations against it.
Until about a week ago everything was fine: the tests ran and passed. Then, out of the blue, we started experiencing timeout errors, and essentially our tests are failing because of this:
failed on setup with "elastic_transport.ConnectionTimeout: Connection timeout caused by: ConnectionTimeout(Connection timeout caused by: ReadTimeoutError(HTTPConnectionPool(host='elasticsearch', port=9200): Read timed out. (read timeout=30)))"
We gradually increased the read timeout from 10 to 30 seconds, but it made no difference.
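To at least tell apart "the node never became reachable" from "the node is up but responding slowly", we are considering a stdlib-only readiness probe run from a pytest fixture before the real tests. This is just a sketch of our idea (the helper name and parameters are ours, not from any library); it roughly mirrors the curl healthcheck above:

```python
import time
import urllib.error
import urllib.request


def wait_for_http(url: str, attempts: int = 10, delay: float = 1.0,
                  timeout: float = 5.0) -> bool:
    """Return True as soon as `url` answers an HTTP request at all,
    False after `attempts` failed tries (connection refused/timeout)."""
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=timeout):
                return True
        except urllib.error.HTTPError:
            # The server answered (e.g. 401/404): it is reachable.
            return True
        except (urllib.error.URLError, OSError):
            time.sleep(delay)
    return False
```

A session-scoped fixture could call wait_for_http("http://elasticsearch:9200") and fail fast with a clear message when it returns False, so a setup timeout would at least tell us whether Elasticsearch ever came up inside the runner.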
Has anyone else experienced such a problem with a similar setup (GitHub, Docker Compose, Elasticsearch, etc.)?
Do you have any idea how we can find the real underlying cause of this?
Any suggestions are more than welcome.
Thanks !!