I can see this in the log:
Entering safepoint region: GenCollectForAllocation
GC(2) Pause Young (Allocation Failure)
GC(2) Using 8 workers of 8 for evacuation
GC(2) Desired survivor size 17891328 bytes, new threshold 6 (max threshold 6)
GC(2) Age table with threshold 6 (max threshold 6)
GC(2) - age 1: 2505304 bytes, 2505304 total
GC(2) - age 2: 4066352 bytes, 6571656 total
GC(2) - age 3: 10031824 bytes, 16603480 total
GC(2) ParNew: 298211K->23277K(314560K)
GC(2) CMS: 0K->0K(699072K)
GC(2) Metaspace: 20810K->20810K(1069056K)
GC(2) Pause Young (Allocation Failure) 291M->22M(989M) 5.505ms
GC(2) User=0.04s Sys=0.00s Real=0.01s
Leaving safepoint region
Total time for which application threads were stopped: 0.0057126 seconds, Stopping threads took: 0.0000459 seconds
As written above, there should be a file in the log directory named after your cluster, i.e. my-search.log. Please share the output of that one.
[2019-09-24T13:35:48,085][INFO ][o.e.b.BootstrapChecks ] [eslnx02] bound or publishing to a non-loopback address, enforcing bootstrap checks
[2019-09-24T13:35:48,089][ERROR][o.e.b.Bootstrap ] [eslnx02] node validation exception
[3] bootstrap checks failed
[1]: initial heap size [2147483648] not equal to maximum heap size [4294967296]; this can cause resize pauses and prevents mlockall from locking the entire heap
[2]: memory locking requested for elasticsearch process but memory is not locked
[3]: the default discovery settings are unsuitable for production use; at least one of [discovery.seed_hosts, discovery.seed_providers, cluster.initial_master_nodes] must be configured
I don't think that's true, and I have tried and confirmed that there is no problem running Elasticsearch with network.host set and transport.host unset.
The issue you had was that there were bootstrap checks failing.
Thanks, I am wondering if there's a bug here. Do you see a line saying bound or publishing to a non-loopback address shortly after the first of these? I.e.:
[2019-09-26T13:04:25,325][INFO ][o.e.t.TransportService ] [node-0] publish_address {192.168.1.139:9302}, bound_addresses {192.168.1.139:9302}, {192.168.1.179:9302}
[2019-09-26T13:04:25,337][INFO ][o.e.b.BootstrapChecks ] [node-0] bound or publishing to a non-loopback address, enforcing bootstrap checks
Would it be possible to share the whole log output from starting up the node through to seeing the first master node changed message?
Also, could you comment out just the transport.* lines in your config, restart the node and share the whole log output from starting up the node through to the whole message saying bootstrap checks failed?
[1] bootstrap checks failed
[1]: the default discovery settings are unsuitable for production use; at least one of [discovery.seed_hosts, discovery.seed_providers, cluster.initial_master_nodes] must be configured
This means that at least one of discovery.seed_hosts, discovery.seed_providers, or cluster.initial_master_nodes must be set.
Could you please send me the documentation where I can find this?
However I really would like to see the complete logs from your node, both from when it successfully starts up and when it fails. The excerpts you've shared are unfortunately not enough for us to work out whether there's a bug that needs fixing here.
Nevertheless, to correct myself, I am sure the key point here is this: if you set network.host to a static (non-localhost) address, you have to set at least one of discovery.seed_hosts, discovery.seed_providers, or cluster.initial_master_nodes.
Yes, this is by design, and documented. The fix is simply to set one of these settings (e.g. discovery.seed_hosts: []). That's quite different from setting transport.host: _site_, which I think has no effect on whether that bootstrap check passes or not.
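For illustration, a minimal elasticsearch.yml sketch of that fix might look like the following. The cluster and node names are taken from the excerpts in this thread; the network.host value is an assumption, since the poster's actual config was not shared. Treat it as a starting point rather than a recommended production setup.

```yaml
# elasticsearch.yml (sketch; values are assumptions, not the poster's real config)
cluster.name: my-search        # cluster name mentioned earlier in the thread
node.name: eslnx02             # node name as it appears in the log excerpts
network.host: _site_           # assumed non-loopback bind; this is what triggers the bootstrap checks

# Setting any one of the three discovery settings satisfies the discovery bootstrap check.
discovery.seed_hosts: []       # the example value suggested in the thread

# Alternative for a brand-new cluster: name the initial master-eligible node(s).
# cluster.initial_master_nodes: ["eslnx02"]
```

Note that this only addresses the discovery check. The heap size and memory locking failures shown in the earlier log excerpt are separate checks and are not affected by these settings.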