Been working for a while and now

Hi All,

I have been using Elasticsearch successfully for a while now, but over the last two days the service has kept stopping.

I searched the discussion threads to see what it could be, and I have posted the log output below.

I would appreciate any help.

/etc/init.d/elasticsearch status
● elasticsearch.service - Elasticsearch
Loaded: loaded (/usr/lib/systemd/system/elasticsearch.service; enabled; vendor preset: disabled)
Active: failed (Result: signal) since Sun 2017-11-05 07:36:01 GMT; 17min ago
Docs: http://www.elastic.co
Main PID: 7967 (code=killed, signal=KILL)

Nov 05 07:03:30 server systemd[1]: Starting Elasticsearch...
Nov 05 07:03:30 server systemd[1]: Started Elasticsearch.
Nov 05 07:36:01 server systemd[1]: elasticsearch.service: main process exited, code=killed, status=9/KILL
Nov 05 07:36:01 server systemd[1]: Unit elasticsearch.service entered failed state.
Nov 05 07:36:01 server systemd[1]: elasticsearch.service failed.

and from logs:

[2017-11-05T07:03:36,852][INFO ][o.e.n.Node ] [] initializing ...
[2017-11-05T07:03:37,239][INFO ][o.e.e.NodeEnvironment ] [rGV0GZQ] using [1] data paths, mounts [[/ (rootfs)]], net usable_space [29.7gb], net total_space [61gb], spins? [unknown], types [rootfs]
[2017-11-05T07:03:37,241][INFO ][o.e.e.NodeEnvironment ] [rGV0GZQ] heap size [1.9gb], compressed ordinary object pointers [true]
[2017-11-05T07:03:37,302][INFO ][o.e.n.Node ] node name [rGV0GZQ] derived from node ID [rGV0GZQvQRa2o1ptLcHjjw]; set [node.name] to override
[2017-11-05T07:03:37,303][INFO ][o.e.n.Node ] version[5.6.3], pid[7967], build[1a2f265/2017-10-06T20:33:39.012Z], OS[Linux/3.10.0-693.2.2.el7.x86_64/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/1.8.0_151/25.151-b12]
[2017-11-05T07:03:37,303][INFO ][o.e.n.Node ] JVM arguments [-Xms2g, -Xmx2g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -Djdk.io.permissionsUseCanonicalPath=true, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Dlog4j.skipJansi=true, -XX:+HeapDumpOnOutOfMemoryError, -Des.path.home=/usr/share/elasticsearch]
[2017-11-05T07:03:39,814][INFO ][o.e.p.PluginsService ] [rGV0GZQ] loaded module [aggs-matrix-stats]
[2017-11-05T07:03:39,817][INFO ][o.e.p.PluginsService ] [rGV0GZQ] loaded module [ingest-common]
[2017-11-05T07:03:39,817][INFO ][o.e.p.PluginsService ] [rGV0GZQ] loaded module [lang-expression]
[2017-11-05T07:03:39,817][INFO ][o.e.p.PluginsService ] [rGV0GZQ] loaded module [lang-groovy]
[2017-11-05T07:03:39,817][INFO ][o.e.p.PluginsService ] [rGV0GZQ] loaded module [lang-mustache]
[2017-11-05T07:03:39,817][INFO ][o.e.p.PluginsService ] [rGV0GZQ] loaded module [lang-painless]
[2017-11-05T07:03:39,817][INFO ][o.e.p.PluginsService ] [rGV0GZQ] loaded module [parent-join]
[2017-11-05T07:03:39,818][INFO ][o.e.p.PluginsService ] [rGV0GZQ] loaded module [percolator]
[2017-11-05T07:03:39,818][INFO ][o.e.p.PluginsService ] [rGV0GZQ] loaded module [reindex]
[2017-11-05T07:03:39,818][INFO ][o.e.p.PluginsService ] [rGV0GZQ] loaded module [transport-netty3]
[2017-11-05T07:03:39,818][INFO ][o.e.p.PluginsService ] [rGV0GZQ] loaded module [transport-netty4]
[2017-11-05T07:03:39,819][INFO ][o.e.p.PluginsService ] [rGV0GZQ] no plugins loaded
[2017-11-05T07:03:44,372][INFO ][o.e.d.DiscoveryModule ] [rGV0GZQ] using discovery type [zen]
[2017-11-05T07:03:46,118][INFO ][o.e.n.Node ] initialized
[2017-11-05T07:03:46,119][INFO ][o.e.n.Node ] [rGV0GZQ] starting ...
[2017-11-05T07:03:46,796][INFO ][o.e.t.TransportService ] [rGV0GZQ] publish_address {127.0.0.1:9300}, bound_addresses {127.0.0.1:9300}
[2017-11-05T07:03:49,986][INFO ][o.e.c.s.ClusterService ] [rGV0GZQ] new_master {rGV0GZQ}{rGV0GZQvQRa2o1ptLcHjjw}{3LA2YfquSki5MMGjl9JPHw}{127.0.0.1}{127.0.0.1:9300}, reason: zen-disco-elected-as-master ([0] nodes joined)
[2017-11-05T07:03:50,050][INFO ][o.e.h.n.Netty4HttpServerTransport] [rGV0GZQ] publish_address {127.0.0.1:9200}, bound_addresses {127.0.0.1:9200}
[2017-11-05T07:03:50,050][INFO ][o.e.n.Node ] [rGV0GZQ] started
[2017-11-05T07:03:50,495][INFO ][o.e.g.GatewayService ] [rGV0GZQ] recovered [1] indices into cluster_state
[2017-11-05T07:03:54,513][INFO ][o.e.c.r.a.AllocationService] [rGV0GZQ] Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[server_xf][2], [server_xf][4]] ...]).

Is there any more to the log?

Nothing more in the Elasticsearch log, but I can see in the server's system log that this first happened on 1 November 2017:

Nov 1 11:07:32 server kernel: Out of memory: Kill process 27558 (java) score 515 or sacrifice child
Nov 1 11:07:32 server kernel: Killed process 27558 (java) total-vm:8038140kB, anon-rss:1735924kB, file-rss:0kB, shmem-rss:0kB
Nov 1 11:07:33 server systemd: elasticsearch.service: main process exited, code=killed, status=9/KILL
Nov 1 11:07:33 server systemd: Unit elasticsearch.service entered failed state.

Looks like your OS's OOM killer and not Elasticsearch. You will need to figure out how to configure that based on your OS.
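If it helps to confirm how often this has happened, here is a minimal Python sketch that scans system log lines for the kernel's "Killed process" message and reports the memory figures in GiB. The function name `find_oom_kills` and the regular expression are my own, written against the format of the two kernel lines quoted above; other kernel versions may word the message differently.

```python
import re

# Pattern written against the "Killed process" line quoted in this thread;
# it is an assumption that other kernels use the same wording.
KILLED_RE = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\)"
    r".*?total-vm:(?P<total_vm_kb>\d+)kB, anon-rss:(?P<anon_rss_kb>\d+)kB"
)

KB_PER_GIB = 1024 ** 2


def find_oom_kills(lines):
    """Return one dict per OOM kill found in an iterable of log lines."""
    kills = []
    for line in lines:
        match = KILLED_RE.search(line)
        if match is None:
            continue
        kills.append({
            "pid": int(match.group("pid")),
            "name": match.group("name"),
            "total_vm_gib": int(match.group("total_vm_kb")) / KB_PER_GIB,
            "anon_rss_gib": int(match.group("anon_rss_kb")) / KB_PER_GIB,
        })
    return kills


# The two kernel lines from the system log posted above.
sample = [
    "Nov 1 11:07:32 server kernel: Out of memory: Kill process 27558 (java) "
    "score 515 or sacrifice child",
    "Nov 1 11:07:32 server kernel: Killed process 27558 (java) "
    "total-vm:8038140kB, anon-rss:1735924kB, file-rss:0kB, shmem-rss:0kB",
]

for kill in find_oom_kills(sample):
    print(f"pid {kill['pid']} ({kill['name']}): "
          f"anon-rss {kill['anon_rss_gib']:.2f} GiB, "
          f"total-vm {kill['total_vm_gib']:.2f} GiB")
```

To run it against the real log, pass an open file handle instead of `sample`. On CentOS the kernel messages usually end up in /var/log/messages, but that path is an assumption about your setup.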

Yes, that's the common message right before a service failure.

I'm using CentOS. Isn't the OOM killer designed to protect the system? Why would it be killing Elasticsearch?

I do have excellent support here in the UK from my hosting company - I will ask them for some advice also.

The OOM killer can protect the system, though sometimes it can cause problems with processes like this :)
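On why Elasticsearch gets picked: as far as I know, the kernel scores each process largely by how much memory it uses, and a JVM with a 2g heap is often the biggest consumer on a small server. Linux exposes that score per process under /proc. Below is a minimal sketch for reading it; the helper name `read_oom_info` and the `proc_root` parameter are my own and are not part of any Elasticsearch or system tool.

```python
from pathlib import Path


def read_oom_info(pid, proc_root="/proc"):
    """Return the kernel's OOM score and score adjustment for one process.

    Reads /proc/<pid>/oom_score and /proc/<pid>/oom_score_adj (Linux only).
    proc_root is a hypothetical parameter added so the function can be
    pointed at a different directory for testing.
    """
    base = Path(proc_root) / str(pid)
    return {
        "oom_score": int((base / "oom_score").read_text().strip()),
        "oom_score_adj": int((base / "oom_score_adj").read_text().strip()),
    }
```

Call it with the PID of the running Elasticsearch process, for example `read_oom_info(7967)` while that process is still alive. A higher `oom_score` means the process is more likely to be chosen when memory runs out.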

All sorted now. My hosts have increased the server's memory.