I just upgraded my cluster to 5.2, and while it was moving some shards after the restart, one of the nodes just died without any logs. The only thing I could find was this in syslog:
Feb 2 20:53:34 es-replicashard2 kernel: [896534.883938] Out of memory: Kill process 11816 (java) score 354 or sacrifice child
Feb 2 20:53:34 es-replicashard2 kernel: [896534.884353] Killed process 11816 (java) total-vm:805368608kB, anon-rss:31801460kB, file-rss:3147904kB
Feb 2 20:53:35 es-replicashard2 systemd: elasticsearch.service: Main process exited, code=killed, status=9/KILL
Feb 2 20:53:35 es-replicashard2 systemd: elasticsearch.service: Unit entered failed state.
Feb 2 20:53:35 es-replicashard2 systemd: elasticsearch.service: Failed with result 'signal'.
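To make sense of the kernel's numbers, I converted the kB figures from the "Killed process" line above into GiB:

```shell
# Values (in kB) copied straight from the kernel's "Killed process" line.
# total-vm is reserved virtual address space (mostly unbacked, not real RAM);
# anon-rss is the resident anonymous memory the OOM killer actually counts.
echo "total-vm: $(( 805368608 / 1024 / 1024 )) GiB (virtual, mostly unbacked)"
echo "anon-rss: $(( 31801460 / 1024 / 1024 )) GiB (resident, vs my 28 GB heap)"
echo "file-rss: $(( 3147904 / 1024 / 1024 )) GiB (file-backed pages)"
```

So the process was resident at roughly 30 GiB, a couple of GiB above the configured 28 GB heap, which I assume is heap plus JVM off-heap overhead.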
I have around 3.6 billion documents on 8 nodes: 2 client nodes, 4 master+data nodes, and 2 data-only nodes. Every node except the clients has 94 GB of RAM with a 28 GB heap (and 14 cores each). The shard settings (allocation, relocation, etc.) are the defaults, and each node holds around 150-160 shards.
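In case it's useful, this is how I check per-node shard counts and heap pressure (assuming the REST API is reachable on localhost:9200; adjust the host for your setup):

```shell
# Shards and disk usage per node.
curl -s 'localhost:9200/_cat/allocation?v'
# Heap and RAM usage per node.
curl -s 'localhost:9200/_cat/nodes?v&h=name,heap.percent,heap.max,ram.percent'
```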
So my questions are:
- What does this out-of-memory kill indicate? Is it about the heap size, since it says (java), or about the total memory of the machine?
- Is there a memory limit on the number of shards each node can handle? If so, how can I calculate the memory needed during shard relocation after a restart?
- I haven't seen an out-of-memory during shard relocation since I started with this cluster (version 0.9). I have seen out-of-memory errors on big searches/aggregations (before version 2.0), and I've seen a performance impact when a node holds lots of shards. But I think my nodes are big enough for a moderate load of 40-50 million inserts/updates and 20k-30k queries/aggregations per day. Am I wrong about this?
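For the shard-count question, the only rule of thumb I've seen quoted (I can't vouch for it being an official limit) is on the order of 20 shards per GB of heap, which would put my nodes well inside the ceiling:

```shell
heap_gb=28                   # my per-node heap
shards=160                   # roughly what each node holds now
limit=$(( heap_gb * 20 ))    # rule-of-thumb ceiling, not an official limit
echo "rule-of-thumb ceiling: ${limit} shards; currently holding: ${shards}"
```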
I've always looked at disk and performance when deciding whether to add a new node to the cluster; I never thought shard relocation had a memory cost I'd have to account for. It feels a bit like MongoDB's index building, which runs out of memory if the index is bigger than the machine's memory.
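If relocation memory really is the issue, I'm considering throttling recoveries via the cluster settings API (these setting names exist in 5.x; the values below are guesses on my part, not recommendations):

```shell
# Limit concurrent incoming/outgoing recoveries per node and cap recovery
# bandwidth, so relocation puts less pressure on each node at once.
curl -XPUT 'localhost:9200/_cluster/settings' -H 'Content-Type: application/json' -d '
{
  "transient": {
    "cluster.routing.allocation.node_concurrent_recoveries": 2,
    "indices.recovery.max_bytes_per_sec": "40mb"
  }
}'
```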
PS: I removed about 400M documents (packetbeat) and started the node again. The cluster is green.
Update: The cluster was green but it happened again!