For three weeks now I have been trying to get an ES node/cluster to ingest about 80,000 documents in bulk without crashing.
I receive the documents with Logstash before sending them on to ES.
I have 1 data node with 2 vCPUs and 4 GB of RAM, plus 6 data disks rated at 500 IOPS each.
No matter how I tweak things, ES inevitably gets OOM killed. Renting a cloud server with 32 GB of RAM just in case one batch of documents arrives is not an option. There has to be /some/ way to make ES allocate its resources so that it doesn't kill itself.
I've seen people mention that dynamic heap sizing in Java is not THAT big of a problem any more. However, because of the bootstrap checks, ES will only start with a fixed heap of X GB (-Xms must equal -Xmx). No more. No less. Ever.
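For reference, this is roughly the fixed-heap setup I'm stuck with right now (a sketch, assuming ES 7.x where files in jvm.options.d are picked up; the 2g value is just what I'm currently experimenting with on the 4 GB node):

```
# config/jvm.options.d/heap.options (sketch of my current settings)
# The bootstrap checks require -Xms and -Xmx to be equal, so the heap
# is locked at this size for the whole life of the process.
-Xms2g
-Xmx2g
```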
This makes it extremely challenging to let ES queue up and absorb a peak of data while still conserving resources when usage is low.
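What I had hoped to do instead is buffer the peak in Logstash rather than in the ES heap, along these lines (a sketch of logstash.yml using the persistent queue; the size limits are guesses for my disks, not tested values):

```
# logstash.yml -- spill bursts to disk instead of pushing them straight at ES
queue.type: persisted          # buffer incoming events on disk
queue.max_bytes: 2gb           # cap the on-disk queue (guessed value)
queue.checkpoint.writes: 1024  # how often the queue checkpoints
```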
I will not be in charge of the schemas of incoming documents and indices, so I cannot fiddle with individual field mappings or index tricks to speed things up. It's acceptable if ingestion takes a (loong) time; crashing in a loop forever, less so.
I have also tried to find ways to have Filebeat and Logstash limit their throughput, but they hose the poor data node every freaking time.
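For completeness, this is the kind of throttling I've been experimenting with (a sketch only; pipeline.workers, pipeline.batch.size and bulk_max_size are real settings, but the values and the "logstash:5044" host are just placeholders from my latest attempt):

```
# logstash.yml -- fewer, smaller bulk requests towards ES
pipeline.workers: 1
pipeline.batch.size: 125   # events per bulk request per worker
pipeline.batch.delay: 50   # ms to wait before flushing a partial batch
```

```
# filebeat.yml -- smaller batches towards Logstash
output.logstash:
  hosts: ["logstash:5044"]  # placeholder host
  bulk_max_size: 512        # down from the default 2048
  slow_start: true          # supposedly ramps batch sizes up gradually
```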