Elasticsearch 5.1.1 keeps dying after 20 minutes

Hi there,

I hope somebody can help me.

I have an ELK instance on Ubuntu 16.04 (2 GB of RAM, 30 GB of HDD). I can set up visualisations, dashboards and all.

BUT the Elasticsearch instance keeps dying after 20 minutes or so (around 6,000 rows of input).

I've tried adding the memlock settings to /etc/security/limits.conf:

elasticsearch soft memlock unlimited
elasticsearch hard memlock unlimited

still no luck.
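For reference, my understanding is that those limits.conf entries only matter if memory locking is also requested on the Elasticsearch side, and that a service started by systemd takes its limit from a unit override rather than from limits.conf. A rough sketch, assuming the default Ubuntu package paths:

# /etc/elasticsearch/elasticsearch.yml -- ask Elasticsearch to lock the heap in RAM
bootstrap.memory_lock: true

# /etc/systemd/system/elasticsearch.service.d/override.conf -- raise the limit for the systemd unit
[Service]
LimitMEMLOCK=infinity

# reload and restart, then check that "mlockall" is reported as true
sudo systemctl daemon-reload && sudo systemctl restart elasticsearch
curl -s 'http://localhost:9200/_nodes?filter_path=**.mlockall&pretty'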

Can anyone point me to where I should start debugging? I've tried checking /var/log/elasticsearch/elasticsearch.log, but there is nothing much about the "dying" part; I can only see that it started.

Cheers,

Nathan

Hi, how much of that 2 GB of RAM is allocated in /etc/elasticsearch/jvm.options?

You say the Elasticsearch instance keeps dying: what exactly are you experiencing on the operating system itself?

Cheers,

By dying, do you mean the process is killed or unresponsive?
Does it only fail when you are feeding it new docs?
Are you using any unusual plugins? (e.g. I remember reading Zookeeper can call System.exit when unhappy).

Plugins cannot call System#exit; we have the permissions for that locked down now.

Hi Jymit,

I've checked /etc/elasticsearch/jvm.options and it shows:

-Xms1g
-Xmx1g

Is that enough?

The Elasticsearch process just stops running after a few minutes (15-20 minutes). The other processes (Kibana, Nginx, Logstash) are still running fine.

Here is a screenshot of Kibana:

Hi Mark,

Correct me if I'm wrong, but it seems the process is being killed:

● elasticsearch.service - Elasticsearch
   Loaded: loaded (/usr/lib/systemd/system/elasticsearch.service; enabled; vendo
   Active: failed (Result: signal) since Tue 2017-01-03 05:01:55 UTC; 22h ago
     Docs: http://www.elastic.co
  Process: 1461 ExecStart=/usr/share/elasticsearch/bin/elasticsearch -p ${PID_DI
  Process: 1422 ExecStartPre=/usr/share/elasticsearch/bin/elasticsearch-systemd-
 Main PID: 1461 (code=killed, signal=KILL)

Yes, it only fails when I am feeding new docs.

The only plugin I have is Timelion, which was installed by default.

Since you're running Elasticsearch with a 1 GB heap on a machine with 2 GB of RAM, I suspect that your instance is being killed by the OS OOM killer. Check your kernel logs.
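Something along these lines should show whether the OOM killer is the culprit (exact log locations vary by distribution, so adjust as needed):

# recent kernel messages mentioning the OOM killer
dmesg -T | grep -i -E 'oom|killed process'

# or search the kernel log file directly on Ubuntu
grep -i -E 'oom|killed process' /var/log/kern.log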

Hi Jason,

I checked my /var/log/kern.log:

[ 1295.629750] node invoked oom-killer: gfp_mask=0x24201ca, order=0, oom_score_adj=0
[ 1295.629876]  [<ffffffff81192722>] oom_kill_process+0x202/0x3c0
[ 1295.630149] Out of memory: Kill process 1461 (java) score 679 or sacrifice child
[ 1295.631466] Killed process 1461 (java) total-vm:3642564kB, anon-rss:1367924kB, file-rss:21488kB

Looks like you are right. Do you have any suggestions on what I should do?
Decrease the heap or increase my RAM?

Cheers!

The immediate problem is running Elasticsearch, Logstash, Kibana, and nginx on a machine with 2 GB of RAM. Even if you drop the heap by half you're likely to still have trouble, and then you're more likely to run into heap space issues in Elasticsearch. I think you need to either move some of those other processes off this host, or get more RAM.
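If you do want to experiment with a smaller heap in the meantime, it's the same two flags you already found in /etc/elasticsearch/jvm.options, and they should be kept equal to each other, for example:

-Xms512m
-Xmx512m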

Hi Jason,

I am going to bump it to 4 GB of RAM; I hope this will help.

Would you know how much RAM is normal for an ELK stack on a single machine?

Cheers!

Hi, this would very much depend on what you are expecting to use this server for.
If this is a server purpose-built for testing 5.1.1, then the resources you have may well suffice. It all goes hand in hand with what you are looking to achieve here.

Hi Jymit,

In fact, I was thinking of making this a production ELK server. How do you normally judge the server requirements for ELK? Based on the number of docs coming in, or something else?

Cheers!

Running all three on a single machine with only 4 GB might be too much, especially combined with an nginx server (it really depends on your use-case though). Elasticsearch loves the filesystem cache, but if all the memory not dedicated to the Elasticsearch heap is going to other processes, there is not going to be any room left over for the filesystem cache.
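A quick way to see what is actually left over for the filesystem cache once everything is running is plain old free; the buff/cache and available columns are the interesting ones:

free -m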

Hello Jason,

I've just tried reducing the heap (-Xms) to 750 MB and, fortunately (fingers crossed), the server has been running fine for a few days now. I am feeding it around 40k hits every 15 minutes or so.
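In case it helps anyone else, that is just a matter of editing the heap flags in /etc/elasticsearch/jvm.options, for example (both values are normally kept equal):

-Xms750m
-Xmx750m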

Thank you for your help!

You're very welcome.
