Elasticsearch sizing

I have tried for three weeks now to get an ES node/cluster to ingest about 80,000 documents in bulk without crashing.

I receive the documents with Logstash before sending them to ES.

I have 1 data node with 2 vCPU and 4 GB RAM. It has 6 data disks with 500 IOPS each.

No matter how I tweak things, ES inevitably gets OOM killed. Renting a cloud server with 32 GB RAM just in case one batch of documents arrives is not an option. There has to be /some/ way to make ES allocate its resources in a way that doesn't get the process killed.

I've seen people mention that dynamic heap sizing in Java is not THAT big of a problem any more. However, due to the bootstrap checks ES can only be started with a fixed X GB heap (the initial and max heap sizes have to match). No more. No less. Ever.
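For reference, this is how I pin the heap in config/jvm.options. On a 4 GB node the usual guidance is to give ES at most about half the RAM, and the heap size bootstrap check insists that min and max match; the 2g value here is just my guess for this node, not a recommendation:

```
# config/jvm.options (or a file under config/jvm.options.d/ on recent versions)
# Fixed heap for a 4 GB RAM node: roughly half the RAM, and Xms == Xmx,
# otherwise the heap size bootstrap check refuses to start the node.
-Xms2g
-Xmx2g
```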

This makes it extremely challenging to let ES queue up and absorb a peak of data while conserving resources when usage is low.
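For the "absorb a peak" part, the one thing I have found so far is Logstash's persistent queue, which buffers events on disk in front of ES instead of holding them in memory. A sketch of the logstash.yml settings (the size and path are placeholder values for my setup, not recommendations):

```
# logstash.yml -- buffer bursts on disk instead of in memory
queue.type: persisted                  # default is "memory"
queue.max_bytes: 4gb                   # cap on the on-disk buffer; pick something the disks can hold
path.queue: /var/lib/logstash/queue    # optional, defaults to a directory under path.data
```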

I will not be in charge of the schemas of incoming documents and indices, so I cannot fiddle with individual field mappings and index tricks to speed things up. Things taking (a loong) time to process when they come in is acceptable. Crashing in a loop forever, less so.

I have also tried finding ways to have Filebeat and Logstash limit their throughput, but they hose the poor data node every freaking time :frowning:
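The knobs I have been experimenting with on the Logstash side are the pipeline batch size and worker count, which together decide how big and how concurrent the bulk requests to ES end up being. A sketch of logstash.yml for a 2 vCPU / 4 GB node (the values are guesses for this setup):

```
# logstash.yml -- smaller, less concurrent bulk requests to ES
pipeline.workers: 1          # default is one worker per CPU core
pipeline.batch.size: 125     # events per worker per bulk request (125 is the default)
pipeline.batch.delay: 50     # ms to wait for a batch to fill before flushing (default)
```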

How big are your bulk requests? I.e., how many documents does Logstash send per bulk, and what volume in MB does that represent?

Can you tell what the root cause of the OOM is (many different things can cause it)? Any stack trace?

Do you also have search requests going on when indexing your bulk?

How big is your current index? How many shards does it have?

Filebeat and Logstash by default use quite small bulk requests over a few connections in parallel, in order not to overload the cluster. Are you using them with default settings?
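For reference, when Filebeat ships to Logstash the relevant output settings look roughly like this; the host is a placeholder, and as far as I remember the values shown are the defaults:

```
# filebeat.yml -- Logstash output section, defaults written out explicitly
output.logstash:
  hosts: ["logstash-host:5044"]   # placeholder host
  bulk_max_size: 2048             # max events per batch sent to Logstash (default)
  worker: 1                       # connections per configured host (default)
```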
