Hi,
I'm running Elasticsearch in a multi-node cluster.
Here's the configuration file (elasticsearch.yml) from one of the two Elasticsearch log master nodes (the cluster also has 6 Elasticsearch log nodes):
# Network
network:
  host: 0.0.0.0
  publish_host: c4t19154.xxx.xxx
http.port: 9200
# Cluster / node name
cluster.name: "es-sa20-xxx-log-hdp-itg-h4"
node.name: "c4t19154.xxx.xxx"
# Node role
node.master: True
node.data: False
# Discovery
discovery.seed_hosts: ['c4t19156.xxx.xxx:9300', 'c4t19154.xxx.xxx:9300', 'c4t19145.xxx.xxx:9300', 'c4t19147.xxx.xxx:9300', 'c4t19867.xxx.xxx:9300', 'c4t19149.xxx.xxx:9300', 'c4t19150.xxx.xxx:9300', 'hc9t09679.xxx.xxx:9300']
cluster.initial_master_nodes: ['c4t19156.xxx.xxx', 'c4t19154.xxx.xxx']
# Memory - Performance
bootstrap.memory_lock: true
# Monitoring
xpack.monitoring.collection.enabled: true
xpack.monitoring.exporters:
  csa_monitoring:
    type: http
    host: ['https://c9t24539.xxx.xxx:9200', 'https://c9t24540.xxx.xxx:9200', 'https://c9t24541.xxx.xxx:9200']
    auth:
      username: "elastic"
      password: "changeme"
    ssl:
      certificate_authorities: [ "/usr/share/elasticsearch/config/XXX_ENT_Private_SSL_CA_bundle.crt" ]
# Security
xpack.security.enabled: true
xpack.security.audit.enabled: false
# TLS/SSL
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.key: "/usr/share/elasticsearch/config/sa-monitoring-itg-h4.xxx.xxx.key"
xpack.security.transport.ssl.certificate: "/usr/share/elasticsearch/config/sa-monitoring-itg-h4.xxx.xxx.crt"
xpack.security.transport.ssl.certificate_authorities: [ "/usr/share/elasticsearch/config/XXX_ENT_Private_SSL_CA_bundle.crt" ]
xpack.security.http.ssl.enabled: true
xpack.security.http.ssl.key: "/usr/share/elasticsearch/config/sa-monitoring-itg-h4.xxx.xxx.key"
xpack.security.http.ssl.certificate: "/usr/share/elasticsearch/config/sa-monitoring-itg-h4.xxx.xxx.crt"
xpack.security.http.ssl.certificate_authorities: [ "/usr/share/elasticsearch/config/XXX_ENT_Private_SSL_CA_bundle.crt" ]
# Authentication - Authorization
xpack.security.authc:
  anonymous:
    username: anonymous_user
    roles: superuser
    authz_exception: true
  realms:
    native.native1:
      order: 0
# Watcher mail config
xpack.notification.email.account:
  exchange_account:
    profile: outlook
    email_defaults:
      from: 'elasticsearch-logs on hdp-itg-h4 <systemteams-tss-rnd@xxx.flowdock.com>'
    smtp:
      starttls.enable: true
      host: smtp1.xxx.com
      port: 25
#Enable OIDC token service
xpack.security.authc.token.enabled: true
##Create an OpenID Connect realm
xpack.security.authc.realms.oidc.uidoidc:
  order: 1
  rp.client_id: "sa-kibana-itg"
  rp.response_type: code
  rp.redirect_uri: "https://sa-monitoring-itg-h4.xxx.xxx/kibana-api/security/v1/oidc"
  op.issuer: "https://login-itg.ext.xxx.com"
  op.authorization_endpoint: "https://login-itg.ext.xxx.com/as/authorization.oauth2"
  op.token_endpoint: "https://login-itg.ext.xxx.com/as/token.oauth2"
  op.jwkset_path: "https://login-itg.ext.xxx.com/pf/JWKS"
  op.userinfo_endpoint: "https://login-itg.ext.xxx.com/idp/userinfo.openid"
  op.endsession_endpoint: "https://login-itg.ext.xxx.com/idp/startSLO.ping"
  rp.post_logout_redirect_uri: "https://sa-monitoring-itg-h4.xxx.xxx/logged_out"
  rp.signature_algorithm: HS256
  claims.principal: uid
  #claims.groups: "http://example.info/claims/groups"
I'm using the following Ansible play to restart Elasticsearch:
---
- name: Stop elasticsearch docker container
  docker_container:
    name: elasticsearch
    image: "{{ it_dtr_host }}/sa20sre/elasticsearch"
    state: stopped
  ignore_errors: yes
- name: Add elasticsearch data dir
  file:
    path: "{{base_path}}/elasticsearch/data"
    state: directory
    mode: 0777
- name: Set Facts
  include: set_facts.yml
- name: Start elasticsearch docker container
  docker_container:
    name: elasticsearch
    image: "{{ it_dtr_host }}/sa20sre/elasticsearch:{{ elasticsearch_version }}"
    state: started
  ignore_errors: yes
After the last task (Start elasticsearch docker container), the Elasticsearch Docker containers start on all nodes, but after approximately 30 seconds all of them stop with the following messages in the logs:
{"type": "server", "timestamp": "2021-04-08T13:39:58,777Z", "level": "INFO", "component": "o.e.t.TransportService", "cluster.name": "docker-cluster", "node.name": "0d7781abc420", "message": "publish_address {172.17.0.4:9300}, bound_addresses {0.0.0.0:9300}" }
{"type": "server", "timestamp": "2021-04-08T13:39:58,790Z", "level": "INFO", "component": "o.e.b.BootstrapChecks", "cluster.name": "docker-cluster", "node.name": "0d7781abc420", "message": "bound or publishing to a non-loopback address, enforcing bootstrap checks" }
ERROR: [1] bootstrap checks failed
[1]: the default discovery settings are unsuitable for production use; at least one of [discovery.seed_hosts, discovery.seed_providers, cluster.initial_master_nodes] must be configured
{"type": "server", "timestamp": "2021-04-08T13:39:58,887Z", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "docker-cluster", "node.name": "0d7781abc420", "message": "stopping ..." }
{"type": "server", "timestamp": "2021-04-08T13:39:59,076Z", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "docker-cluster", "node.name": "0d7781abc420", "message": "stopped" }
{"type": "server", "timestamp": "2021-04-08T13:39:59,077Z", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "docker-cluster", "node.name": "0d7781abc420", "message": "closing ..." }
{"type": "server", "timestamp": "2021-04-08T13:39:59,217Z", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "docker-cluster", "node.name": "0d7781abc420", "message": "closed" }
The log file says that the default discovery settings are unsuitable for production use and that at least one of discovery.seed_hosts, discovery.seed_providers and cluster.initial_master_nodes must be configured. As you can see from the configuration I have two of them configured (discovery.seed_hosts and cluster.initial_master_nodes).
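One detail in the log output above may be relevant here: every line reports cluster.name "docker-cluster" and a container ID (0d7781abc420) as node.name, not the cluster.name and node.name set in elasticsearch.yml. That would be consistent with the container starting on its image defaults instead of this file, and the play shown does not mount any configuration into the container. A minimal sketch of the start task with the file bind-mounted, assuming the configuration lives under {{ base_path }}/elasticsearch/config on the host (a hypothetical location) and that the image reads /usr/share/elasticsearch/config/elasticsearch.yml, as the certificate paths in the configuration suggest:

```yaml
# Sketch only: the host-side config path and the container data path are
# assumptions, not taken from the original play.
- name: Start elasticsearch docker container
  docker_container:
    name: elasticsearch
    image: "{{ it_dtr_host }}/sa20sre/elasticsearch:{{ elasticsearch_version }}"
    state: started
    volumes:
      # Hypothetical host path; mounted read-only over the config file the node loads.
      - "{{ base_path }}/elasticsearch/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml:ro"
      # Data dir created by the "Add elasticsearch data dir" task; container path assumed.
      - "{{ base_path }}/elasticsearch/data:/usr/share/elasticsearch/data"
```

If the image already bakes in the configuration or receives it some other way, this sketch does not apply, and the mismatch in the logged cluster.name would need another explanation.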
Starting Elasticsearch manually doesn't help either: it starts and then stops after a short time with the same error.
I've turned the net upside down trying to find something useful, but nothing I tried helped.
I've even tried to work around this by setting es.enforce.bootstrap.checks to false in the jvm.options file (although this is not a single-node cluster), but it didn't help:
# avoid bootstrap checks
-Des.enforce.bootstrap.checks=false
I'll appreciate any help with this.