Elasticsearch 5.5 is taking more than 4 hours to start in CentOS 7.x

Hi All,

I used Elasticsearch 5.5.2 on CentOS 6.9. It was working fine.
Now i have upgraded my OS to CentOS 7.5. Now it is taking around 4 to 5 hours to enable 9200 port. Logs also not updating till that time.

Process starts immediately. But to update logs and to enable port (9200), it is taking more than 4 hours.
I have enough resource in the system. I have tried by giving -Xms16g -Xmx16g also. But no luck. Please let me know how to fix this issue.

Process started around 12AM and logs started updating by 5AM.

bash-4.2$ free -g
total used free shared buff/cache available
Mem: 31 2 25 0 2 27
Swap: 31 0 31
bash-4.2$

bash-4.2$ ulimit -u
2048
bash-4.2$ ulimit -n
65536
bash-4.2$ grep max /etc/sysctl.conf
vm.max_map_count=262144
bash-4.2$

bash-4.2$ ps -eaf | grep elastic
gmapadm+ 4281 1 10 Oct09 ? 05:06:57 /local/mnt/java/java8/bin/java -Xms2g -Xmx2g -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+AlwaysPreTouch -server -Xss1m -Djava.awt.headless=true -Dfile.encoding=UTF-8 -Djna.nosys=true -Djdk.io.permissionsUseCanonicalPath=true -Dio.netty.noUnsafe=true -Dio.netty.noKeySetOptimization=true -Dio.netty.recycler.maxCapacityPerThread=0 -Dlog4j.shutdownHookEnabled=false -Dlog4j2.disable.jmx=true -Dlog4j.skipJansi=true -XX:+HeapDumpOnOutOfMemoryError -Des.path.home=/local/mnt/elastic/elasticsearch-5.5.2 -cp /local/mnt/elastic/elasticsearch-5.5.2/lib/* org.elasticsearch.bootstrap.Elasticsearch -Epath.data=/local/mnt/elastic/data -Enode.name=rvasanth001.qualcomm.com -Ecluster.name=rvasanth-es -Epath.logs=/local/mnt/elastic/logs -Enetwork.host=site -Ebootstrap.system_call_filter=false -Ediscovery.zen.ping.unicast.hosts=rvasanth001.qualcomm.com,rvasanth002.qualcomm.com,rvasanth003.qualcomm.com -Enode.master=true -Enode.data=true -Enode.ingest=true -d
gmapadm+ 32559 32339 0 23:24 pts/0 00:00:00 grep elastic

bash-4.2$ head /local/mnt/elastic/logs/rvasanth-es.log
[2018-10-09T05:09:36,947][INFO ][o.e.n.Node ] [rvasanth001.qualcomm.com] initializing ...

bash-4.2$ lsb_release -a
LSB Version: :core-4.1-amd64:core-4.1-noarch:cxx-4.1-amd64:cxx-4.1-noarch:desktop-4.1-amd64:desktop-4.1-noarch:languages-4.1-amd64:languages-4.1-noarch:printing-4.1-amd64:printing-4.1-noarch:trialuse-4.1-amd64:trialuse-4.1-noarch
Distributor ID: CentOS
Description: CentOS Linux release 7.5.1804 (Core)
Release: 7.5.1804
Codename: Core

Please help me to fix this issue. And let me know if any more details required.

What do the logs show?

Logs looks good. And service also started. But it is taking almost 5 hours to start logging and to enable port. Please find the below complete log.

bash-4.2$ cat /local/mnt/elastic/logs/rvasanth-es.log
[2018-10-09T05:09:36,947][INFO ][o.e.n.Node ] [rvasanth001.qualcomm.com] initializing ...
[2018-10-09T05:09:37,164][INFO ][o.e.e.NodeEnvironment ] [rvasanth001.qualcomm.com] using [1] data paths, mounts [[/local/mnt (/dev/mapper/vg00-lv_local_mnt)]], net usable_space [45.2gb], net total_space [51.2gb], spins? [possibly], types [ext4]
[2018-10-09T05:09:37,164][INFO ][o.e.e.NodeEnvironment ] [rvasanth001.qualcomm.com] heap size [1.9gb], compressed ordinary object pointers [true]
[2018-10-09T05:09:37,165][INFO ][o.e.n.Node ] [rvasanth001.qualcomm.com] node name [rvasanth001.qualcomm.com], node ID [RGvFeBqJRbWVQCKkHkIvYg]
[2018-10-09T05:09:37,166][INFO ][o.e.n.Node ] [rvasanth001.qualcomm.com] version[5.5.2], pid[4281], build[b2f0c09/2017-08-14T12:33:14.154Z], OS[Linux/3.10.0-862.11.6.el7.x86_64/amd64], JVM[Oracle Corporation/Java HotSpot(TM) 64-Bit Server VM/1.8.0_77/25.77-b03]
[2018-10-09T05:09:37,166][INFO ][o.e.n.Node ] [rvasanth001.qualcomm.com] JVM arguments [-Xms2g, -Xmx2g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -Djdk.io.permissionsUseCanonicalPath=true, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Dlog4j.skipJansi=true, -XX:+HeapDumpOnOutOfMemoryError, -Des.path.home=/local/mnt/elastic/elasticsearch-5.5.2]
[2018-10-09T05:09:38,037][INFO ][o.e.p.PluginsService ] [rvasanth001.qualcomm.com] loaded module [aggs-matrix-stats]
[2018-10-09T05:09:38,037][INFO ][o.e.p.PluginsService ] [rvasanth001.qualcomm.com] loaded module [ingest-common]
[2018-10-09T05:09:38,037][INFO ][o.e.p.PluginsService ] [rvasanth001.qualcomm.com] loaded module [lang-expression]
[2018-10-09T05:09:38,037][INFO ][o.e.p.PluginsService ] [rvasanth001.qualcomm.com] loaded module [lang-groovy]
[2018-10-09T05:09:38,037][INFO ][o.e.p.PluginsService ] [rvasanth001.qualcomm.com] loaded module [lang-mustache]
[2018-10-09T05:09:38,037][INFO ][o.e.p.PluginsService ] [rvasanth001.qualcomm.com] loaded module [lang-painless]
[2018-10-09T05:09:38,037][INFO ][o.e.p.PluginsService ] [rvasanth001.qualcomm.com] loaded module [parent-join]
[2018-10-09T05:09:38,037][INFO ][o.e.p.PluginsService ] [rvasanth001.qualcomm.com] loaded module [percolator]
[2018-10-09T05:09:38,037][INFO ][o.e.p.PluginsService ] [rvasanth001.qualcomm.com] loaded module [reindex]
[2018-10-09T05:09:38,037][INFO ][o.e.p.PluginsService ] [rvasanth001.qualcomm.com] loaded module [transport-netty3]
[2018-10-09T05:09:38,037][INFO ][o.e.p.PluginsService ] [rvasanth001.qualcomm.com] loaded module [transport-netty4]
[2018-10-09T05:09:38,038][INFO ][o.e.p.PluginsService ] [rvasanth001.qualcomm.com] no plugins loaded
[2018-10-09T05:09:39,464][INFO ][o.e.d.DiscoveryModule ] [rvasanth001.qualcomm.com] using discovery type [zen]
[2018-10-09T05:09:40,032][INFO ][o.e.n.Node ] [rvasanth001.qualcomm.com] initialized
[2018-10-09T05:09:40,032][INFO ][o.e.n.Node ] [rvasanth001.qualcomm.com] starting ...
[2018-10-09T05:09:40,180][INFO ][o.e.t.TransportService ] [rvasanth001.qualcomm.com] publish_address {10.49.160.112:9300}, bound_addresses {10.49.160.112:9300}
[2018-10-09T05:09:40,190][INFO ][o.e.b.BootstrapChecks ] [rvasanth001.qualcomm.com] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
[2018-10-09T05:09:43,237][INFO ][o.e.c.s.ClusterService ] [rvasanth001.qualcomm.com] new_master {rvasanth001.qualcomm.com}{RGvFeBqJRbWVQCKkHkIvYg}{btUYfuRiQJKQ9MUZbBPZVQ}{10.49.160.112}{10.49.160.112:9300}, reason: zen-disco-elected-as-master ([0] nodes joined)
[2018-10-09T05:09:43,254][INFO ][o.e.h.n.Netty4HttpServerTransport] [rvasanth001.qualcomm.com] publish_address {10.49.160.112:9200}, bound_addresses {10.49.160.112:9200}
[2018-10-09T05:09:43,254][INFO ][o.e.n.Node ] [rvasanth001.qualcomm.com] started
[2018-10-09T05:09:43,273][INFO ][o.e.g.GatewayService ] [rvasanth001.qualcomm.com] recovered [0] indices into cluster_state
[2018-10-09T05:15:24,403][INFO ][o.e.c.s.ClusterService ] [rvasanth001.qualcomm.com] added {{rvasanth001.qualcomm.com}{oh3jEGfaQdC4c43NMBPidA}{qcb5bMu3RsOUbYXY784qIw}{10.49.160.129}{10.49.160.129:9300},}, reason: zen-disco-node-join[{rvasanth001.qualcomm.com}{oh3jEGfaQdC4c43NMBPidA}{qcb5bMu3RsOUbYXY784qIw}{10.49.160.129}{10.49.160.129:9300}]
[2018-10-09T05:15:24,525][WARN ][o.e.d.z.ElectMasterService] [rvasanth001.qualcomm.com] value for setting "discovery.zen.minimum_master_nodes" is too low. This can result in data loss! Please set it to at least a quorum of master-eligible nodes (current value: [-1], total number of master-eligible nodes used for publishing in this round: [2])
[2018-10-09T05:19:45,186][INFO ][o.e.c.s.ClusterService ] [rvasanth001.qualcomm.com] added {{rvasanth001.qualcomm.com}{yFM4UkJGSAybOK1W43Bv3Q}{MGUG_6O8S0SXPkBZ6-oq5Q}{10.49.160.170}{10.49.160.170:9300},}, reason: zen-disco-node-join[{rvasanth001.qualcomm.com}{yFM4UkJGSAybOK1W43Bv3Q}{MGUG_6O8S0SXPkBZ6-oq5Q}{10.49.160.170}{10.49.160.170:9300}]
bash-4.2$

That doesn't align with what the log shows. That's only 9 or so seconds.

Thanks @warkolm for the reply.

Can you please observe the process start time and the logging start time. Below i have given the process start time.

process started @ Oct 9 00:09:12 2018
Logging started @ 2018-10-09T05:09:36

There is 5 hours difference between process start time and log generation time. So, It means it took 5 hours to enable 9200 port.

bash-4.2$ ps -eo pid,lstart,cmd | grep elastic
4281 Tue Oct 9 00:09:12 2018 /local/mnt/java/java8/bin/java -Xms2g -Xmx2g -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+AlwaysPreTouch -server -Xss1m -Djava.awt.headless=true -Dfile.encoding=UTF-8 -Djna.nosys=true -Djdk.io.permissionsUseCanonicalPath=true -Dio.netty.noUnsafe=true -Dio.netty.noKeySetOptimization=true -Dio.netty.recycler.maxCapacityPerThread=0 -Dlog4j.shutdownHookEnabled=false -Dlog4j2.disable.jmx=true -Dlog4j.skipJansi=true -XX:+HeapDumpOnOutOfMemoryError -Des.path.home=/local/mnt/elastic/elasticsearch-5.5.2 -cp /local/mnt/elastic/elasticsearch-5.5.2/lib/* org.elasticsearch.bootstrap.Elasticsearch -Epath.data=/local/mnt/elastic/data -Enode.name=rvasanth001.qualcomm.com -Ecluster.name=rvasanth-es -Epath.logs=/local/mnt/elastic/logs -Enetwork.host=site -Ebootstrap.system_call_filter=false -Ediscovery.zen.ping.unicast.hosts=rvasanth001.qualcomm.com,rvasanth002.qualcomm.com,rvasanth003.qualcomm.com -Enode.master=true -Enode.data=true -Enode.ingest=true -d
20400 Thu Oct 11 01:03:43 2018 grep elastic
bash-4.2$

What timezone does the server have set?

It is on PDT/PST

Then it's likely the reason you are seeing the difference in times there.

Try stopping Elasticsearch, starting it and then setting up a loop to curl port 9200, then see how long that takes.

Thanks @warkolm for the reply.

Sorry for asking like this.

[Q]What is the relation with timezone? My Server is in PDT/PST timezone. And my elastic process also running in same timezone, My logs also generating in same timezone. And i am also checking the time difference in same timezone.

I hope it is not the case. Because all the activities are happening in same timezone.

I have restarted may times. Every time is the same situation. Taking 4 to 5 hours.

Anyone had simile issue?

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.