I deployed Elasticsearch into a Kubernetes cluster with 3 nodes; each node runs Oracle Linux 7.3. I started 3 Elasticsearch instances (each one is a pod in the Kubernetes cluster). One of them was soon elected as the master node, and the other two joined the cluster successfully. However, I ran into the following issues:
[Issue 1]. I saw the following log message on each of the two non-master instances,
[2018-02-06T08:44:59,220][INFO ][o.e.x.m.e.l.LocalExporter] waiting for elected master node [{elasticsearch-1-default}{KSMtEBd6S1GB5f8pe1AXuw}{x1Q1gL--Rbif7wLkuTR1_w}{192.168.21.31}{192.168.21.31:9300}{ml.machine_memory=14439243776, ml.max_open_jobs=20, ml.enabled=true}] to setup local exporter [default_local] (does it have x-pack installed?)
[Issue 2]. On the master instance, a couple of exceptions (including a NullPointerException) were always logged at startup,
[2018-02-06T08:44:56,449][ERROR][o.e.x.m.c.i.IndexStatsCollector] [elasticsearch-1-default] collector [index-stats] failed to collect data
org.elasticsearch.cluster.block.ClusterBlockException: blocked by: [SERVICE_UNAVAILABLE/1/state not recovered / initialized];
at org.elasticsearch.cluster.block.ClusterBlocks.globalBlockedException(ClusterBlocks.java:165) ~[elasticsearch-6.1.3.jar:6.1.3]
......
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_161]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_161]
[2018-02-06T08:44:56,547][ERROR][o.e.x.m.c.c.ClusterStatsCollector] [elasticsearch-1-default] collector [cluster_stats] failed to collect data
java.lang.NullPointerException: null
at org.elasticsearch.xpack.monitoring.collector.cluster.ClusterStatsCollector.doCollect(ClusterStatsCollector.java:119) ~[x-pack-6.1.3.jar:6.1.3]
at org.elasticsearch.xpack.monitoring.collector.Collector.collect(Collector.java:100) [x-pack-6.1.3.jar:6.1.3]
at org.elasticsearch.xpack.monitoring.MonitoringService$MonitoringExecution$1.doRun(MonitoringService.java:222) [x-pack-6.1.3.jar:6.1.3]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.1.3.jar:6.1.3]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_161]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_161]
......
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_161]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_161]
However, Elasticsearch seemed to recover from these exceptions on its own, because I then saw the following log message,
[2018-02-06T08:45:04,561][INFO ][o.e.c.r.a.AllocationService] [elasticsearch-1-default] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[.monitoring-es-6-2018.02.06][0]] ...]).
[Issue 3]. When I queried Elasticsearch with the command "curl --user elastic:Changeme123 http://<server_ip>:9200?pretty", I got the following response. Note that I set the environment variable "ELASTIC_PASSWORD" to "Changeme123".
{
  "error" : {
    "root_cause" : [
      {
        "type" : "security_exception",
        "reason" : "failed to authenticate user [elastic]",
        "header" : {
          "WWW-Authenticate" : "Basic realm=\"security\" charset=\"UTF-8\""
        }
      }
    ],
    "type" : "security_exception",
    "reason" : "failed to authenticate user [elastic]",
    "header" : {
      "WWW-Authenticate" : "Basic realm=\"security\" charset=\"UTF-8\""
    }
  },
  "status" : 401
}
The related information is below.
[info 1] I built a custom image using the following Dockerfile,
FROM docker.elastic.co/elasticsearch/elasticsearch-platinum:6.1.3
WORKDIR /usr/share/elasticsearch
USER root
COPY custom-entrypoint bin/
COPY elasticsearch.yml config/
COPY log4j2.properties config/
COPY certs/elastic-certificates.p12 config/
RUN chown elasticsearch:elasticsearch config/elasticsearch.yml config/log4j2.properties config/elastic-certificates.p12 bin/custom-entrypoint && \
    chmod 0644 config/elastic-certificates.p12 && chmod 0750 bin/custom-entrypoint
CMD ["/bin/bash", "bin/custom-entrypoint"]
[info 2] The start script custom-entrypoint is as below,
#!/bin/bash
set -e
ulimit -n 65536
ulimit -u 4096
ulimit -l unlimited
su elasticsearch bin/elasticsearch ${1+"$@"}
[info 3] The configuration file elasticsearch.yml is as below,
cluster.name: "elasticsearch-${NAMESPACE}"
node.name: "${POD_NAME}-${NAMESPACE}"
network.host: ${POD_IP}
discovery.zen.ping.unicast.hosts: es-discovery-svc
discovery.zen.minimum_master_nodes: 2
bootstrap.memory_lock: true
gateway.recover_after_nodes: 2
gateway.expected_nodes: 3
gateway.recover_after_time: 5m
discovery.zen.fd.ping_interval: 1s
discovery.zen.fd.ping_timeout: 10s
discovery.zen.fd.ping_retries: 2
xpack.ssl.keystore.path: /usr/share/elasticsearch/config/elastic-certificates.p12
xpack.ssl.truststore.path: /usr/share/elasticsearch/config/elastic-certificates.p12
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
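For reference, the value 2 for discovery.zen.minimum_master_nodes follows the usual quorum rule for zen discovery, (master-eligible nodes / 2) + 1 with integer division. Below is a minimal sketch of that arithmetic, assuming all three instances are master-eligible (node.master is not set in the configuration above, so it keeps its default of true). The function name quorum is only for illustration and is not part of Elasticsearch.

```shell
#!/bin/bash
# Hypothetical helper: computes the zen discovery quorum,
# floor(master_eligible / 2) + 1, using shell integer arithmetic.
quorum() {
  local master_eligible="$1"
  echo $(( master_eligible / 2 + 1 ))
}

# Assumption: 3 master-eligible instances, as in this deployment.
quorum 3   # prints 2
```

With 3 instances this gives 2, which matches the configured value, so the setting itself does not look like the cause of the issues above.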
[info 4] I generated the certificate per the following guide. All Elasticsearch instances share the same certificate.
Please let me know if you need any other inputs. Thanks.