I am trying to configure an Elasticsearch cluster on Ubuntu 18.04 (on Azure). The basic installation works. However, when I set the "network.host" value in the configuration file, the Elasticsearch service does not start.
Error message:
systemd[1]: Started Elasticsearch.
-- Subject: Unit elasticsearch.service has finished start-up
-- Defined-By: systemd
-- Support: Enterprise open source support | Ubuntu
-- Unit elasticsearch.service has finished starting up.
-- The start-up result is RESULT.
elasticsearch[3955]: OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
systemd[1]: elasticsearch.service: Main process exited, code=exited, status=1/FAILURE
systemd[1]: elasticsearch.service: Failed with result 'exit-code'.
The following entries don't work:
network.host: 0.0.0.0
network.host: eth0
network.host: 10.79.10.17
vm.max_map_count=524288 (it works neither with the default value nor with 262144)
By default it refuses connections on the static IP:
curl http://10.79.10.17:9200
curl: (7) Failed to connect to 10.79.10.17 port 9200: Connection refused
When I set the network.host value, the service does not start.
Can you check the log file in /var/log/elasticsearch that is named after your configured cluster name? I assume that a bootstrap check has failed and that the log file contains information on how to fix it, but this is just an assumption for now.
I can see this in the log:
Entering safepoint region: GenCollectForAllocation
GC(2) Pause Young (Allocation Failure)
GC(2) Using 8 workers of 8 for evacuation
GC(2) Desired survivor size 17891328 bytes, new threshold 6 (max threshold 6)
GC(2) Age table with threshold 6 (max threshold 6)
GC(2) - age 1: 2505304 bytes, 2505304 total
GC(2) - age 2: 4066352 bytes, 6571656 total
GC(2) - age 3: 10031824 bytes, 16603480 total
GC(2) ParNew: 298211K->23277K(314560K)
GC(2) CMS: 0K->0K(699072K)
GC(2) Metaspace: 20810K->20810K(1069056K)
GC(2) Pause Young (Allocation Failure) 291M->22M(989M) 5.505ms
GC(2) User=0.04s Sys=0.00s Real=0.01s
Leaving safepoint region
Total time for which application threads were stopped: 0.0057126 seconds, Stopping threads took: 0.0000459 seconds
As written above, there should be a file named after your cluster name, i.e. my-search.log, in the log directory. Please show the output of that one.
[2019-09-24T13:35:48,085][INFO ][o.e.b.BootstrapChecks ] [eslnx02] bound or publishing to a non-loopback address, enforcing bootstrap checks
[2019-09-24T13:35:48,089][ERROR][o.e.b.Bootstrap ] [eslnx02] node validation exception
[3] bootstrap checks failed
[1]: initial heap size [2147483648] not equal to maximum heap size [4294967296]; this can cause resize pauses and prevents mlockall from locking the entire heap
[2]: memory locking requested for elasticsearch process but memory is not locked
[3]: the default discovery settings are unsuitable for production use; at least one of [discovery.seed_hosts, discovery.seed_providers, cluster.initial_master_nodes] must be configured
I don't think that's true, and I have tried and confirmed that there is no problem running Elasticsearch with network.host set and transport.host unset.
The issue you had was that there were bootstrap checks failing.
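For anyone hitting the same first two checks, here is a rough sketch of where they are usually addressed. It assumes the Debian package layout under /etc/elasticsearch and a systemd-managed service, as on the poster's Ubuntu machine; the 4g heap value is only an illustration taken from the maximum heap size in the log, not a recommendation.

```yaml
# /etc/elasticsearch/elasticsearch.yml (excerpt; path assumes the .deb package)

# Check [2] only fires when memory locking has been requested, i.e. when
# this setting is true. Either grant the lock (see below) or remove the line.
bootstrap.memory_lock: true

# Under systemd the setting alone is not enough. The unit also needs
#   [Service]
#   LimitMEMLOCK=infinity
# in an override file (for example via: sudo systemctl edit elasticsearch),
# followed by a restart of the service.

# Check [1] is fixed outside this file, in /etc/elasticsearch/jvm.options,
# by giving -Xms and -Xmx the same value, for example:
#   -Xms4g
#   -Xmx4g
```

Once these checks pass, the only one left in the log above is the discovery check.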
Thanks, I am wondering if there's a bug here. Do you see a line saying bound or publishing to a non-loopback address shortly after the first of these? I.e.:
[2019-09-26T13:04:25,325][INFO ][o.e.t.TransportService ] [node-0] publish_address {192.168.1.139:9302}, bound_addresses {192.168.1.139:9302}, {192.168.1.179:9302}
[2019-09-26T13:04:25,337][INFO ][o.e.b.BootstrapChecks ] [node-0] bound or publishing to a non-loopback address, enforcing bootstrap checks
Would it be possible to share the whole log output from starting up the node through to seeing the first master node changed message?
Also, could you comment out just the transport.* lines in your config, restart the node and share the whole log output from starting up the node through to the whole message saying bootstrap checks failed?
[1] bootstrap checks failed
[1]: the default discovery settings are unsuitable for production use; at least one of [discovery.seed_hosts, discovery.seed_providers, cluster.initial_master_nodes] must be configured
This means that at least one of discovery.seed_hosts, discovery.seed_providers, or cluster.initial_master_nodes must be configured.
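As a minimal sketch of what that could look like in elasticsearch.yml: the cluster name, node name and address below are the ones that appear earlier in this thread, and listing only one node is an assumption, since the other cluster members were never mentioned.

```yaml
# elasticsearch.yml (excerpt), assuming eslnx02 is the first node of the cluster
cluster.name: my-search
node.name: eslnx02
network.host: 10.79.10.17

# Addresses of master-eligible nodes to contact at startup.
# Add the other nodes of the cluster here; only this one is known from the thread.
discovery.seed_hosts: ["10.79.10.17"]

# Node names used to bootstrap a brand-new cluster on its very first start.
# The values must match node.name of the master-eligible nodes.
cluster.initial_master_nodes: ["eslnx02"]
```

Setting either discovery.seed_hosts or cluster.initial_master_nodes is enough to satisfy the check; both are shown because a new multi-node cluster normally needs both. cluster.initial_master_nodes is only used when the cluster forms for the first time, so it can be removed afterwards.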
Could you please send me the documentation where I can find this?
However I really would like to see the complete logs from your node, both from when it successfully starts up and when it fails. The excerpts you've shared are unfortunately not enough for us to work out whether there's a bug that needs fixing here.