Unable to start Elasticsearch 7.5 on RHEL 7

Hello,

I am having an issue running the Elasticsearch service after installation. I have seen another topic here about this which was not answered, and I am running into the same issue as that user. Here are some log files:

Pastebin of elasticsearch log:
https://pastebin.com/VhvdJmWE

Pastebin of (default) /etc/elasticsearch/elasticsearch.yml:
https://pastebin.com/X4zcRrzf

Please let me know if this is an issue related to Java, as the log file seems to suggest. I have tried many things and none of them seem to be working.

Regards

I think this is the breaking error:

[2020-01-07T19:53:52,623][WARN ][o.e.b.BootstrapChecks ] [MachineName] the default discovery settings are unsuitable for production use; at least one of [discovery.seed_hosts, discovery.seed_providers, cluster.initial_master_nodes] must be configured

I think if you set node.name to a value and then use that value for discovery.seed_hosts, it should start.
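Something along these lines in /etc/elasticsearch/elasticsearch.yml should satisfy that bootstrap check (the node name and address below are just examples, not values taken from your setup):

node.name: node-1
discovery.seed_hosts: ["127.0.0.1"]
# cluster.initial_master_nodes is one of the other alternatives the check lists, e.g.:
# cluster.initial_master_nodes: ["node-1"]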

Hi Len,

Thank you for your response! I tried what you mentioned but am still running into issues:

I updated the elasticsearch.yml file to define node.name as well as discovery.seed_hosts. For node.name I tried both the default "node-1" value and the same entry I used for discovery.seed_hosts (127.0.0.1), like so:

(screenshot: new config)

as well as:

(screenshot: config 2)

The discovery.seed_hosts is set to 127.0.0.1 by default, which is what I want, but I defined it anyway, as follows:

Both failed, with the following output in the log:

Very strange that the service still cannot start...

Furthermore, I tried installing this directly from the internet on another sandbox machine and it worked and started! The thing is, for me to use this in a production environment it cannot be connected to the internet, so my issue remains unresolved. I am thinking it might be installing some unmentioned dependencies when installing from the internet? I am trying to install it locally from an .rpm file on a machine not connected to the internet, so maybe it is not installing those dependencies... not sure.

Let me know if there is anything else I could try

Thanks!

FWIW, for a one-node production cluster you can set discovery.seed_hosts: []. As long as you've set it to something, Elasticsearch will be happy.
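In elasticsearch.yml that is literally just the following (a minimal single-node sketch):

# an empty list of seed hosts: nothing else to discover, and the bootstrap check is satisfied
discovery.seed_hosts: []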

The log message stopping... indicates that Elasticsearch was instructed to shut down cleanly by something external. On Linux that means it got something like a SIGTERM; I'm not sure exactly what the Windows equivalent is.

You have to define this the first time.

Oh wait we are on Linux here. So yes, something outside Elasticsearch itself sent it a SIGTERM.

It is perhaps no coincidence that there's almost exactly 30 seconds between the first log message and the stopping...:

[2020-01-07T19:52:39,361][INFO ][o.e.e.NodeEnvironment    ] [MachineName] using [1] data paths, mounts [[/var (/dev/mapper/rhel-var)]], net usable_space [31.8gb], net total_space [31.9gb], types [xfs]
...
[2020-01-07T19:53:09,640][INFO ][o.e.n.Node               ] [MachineName] stopping ...

Maybe whatever is starting Elasticsearch up is waiting for some indication that the process is healthy and giving up after 30 seconds?

Thank you for your responses, guys! I have tried what you mentioned: I commented out the node.name field and set discovery.seed_hosts: [] as below:

I tried to start the service and the same thing showed up in the log:

I am confused why a SIGTERM is being sent, as I am using systemctl to start the service and nothing external should be stopping it. I have it running smoothly on an identical machine that is just connected to the internet, so I am confident that no external program is trying to shut it down.

Very strange. I will try to define cluster.initial_master_nodes next to see if anything changes, but it seems to be something else.

Regards

Did you try restarting your Linux system? Maybe some process has run away.

Yes sir! I also tried defining cluster.initial_master_nodes, to no avail. Maybe I hit some sort of bug?

By the way, cluster.initial_master_nodes is just one node entry.

I think systemd does have a startup timeout. See if that's set to 30 seconds for this service and, if so, increase it to something more lenient.
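You can see what it is currently set to with, for instance:

systemctl show elasticsearch --property=TimeoutStartSec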

You might check /var/log/messages or equivalent; maybe the OOM killer is getting it.
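For example (generic checks, nothing Elasticsearch-specific):

# look for OOM-killer activity around the time the service stopped
sudo dmesg -T | grep -iE 'out of memory|killed process'
sudo grep -i 'oom' /var/log/messages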

That wouldn't result in the process shutting down cleanly, as it is doing.

This setting is not (yet) relevant. Let's focus on keeping the process alive first. We can worry about config like that later.

David, I still believe it has to do with the setup, because I don't see anything in the startup script about a timeout:

/usr/lib/systemd/system/elasticsearch.service

Stressesdsalmon, please post more log entries if you have them, from when it started until it ended.

Have you tried disabling SELinux? As David was mentioning, tail journalctl -xe or /var/log/messages when you restart the elasticsearch service to see if there is additional logging.
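For example:

# follow the service's journal while restarting it in another terminal
journalctl -u elasticsearch -f
# check whether SELinux is enforcing, and look for recent denials
getenforce
sudo grep -i denied /var/log/audit/audit.log | tail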

Thank you all for your input, I really do appreciate it! It seems like the issue resides with the systemd timeout, like David mentioned. I set the timeout to 160 seconds, restarted after a successful install, and attempted to start the service with systemd again. It worked!! I am getting an active, running service now. It is still very strange to me that an internet-connected machine is able to spin up the service in a shorter time frame than an offline machine. Nevertheless, I am very glad I was able to get it going with all your help.
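For anyone who hits the same thing: whether you put it in a drop-in created with systemctl edit elasticsearch or directly in the unit file, the change boils down to one line in the [Service] section (160 is just the value I happened to pick):

[Service]
TimeoutStartSec=160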

Thanks again!

Cheers!
