Elasticsearch sizing

I have tried for three weeks now to get an ES node/cluster to ingest about 80,000 documents in bulk without crashing.

I receive the documents with Logstash before sending them to ES.

I have 1 data node with 2 vCPU and 4 GB RAM. It has 6 data disks with 500 IOPS each.

No matter how I tweak things, ES inevitably gets OOM killed. Renting a cloud server with 32 GB RAM just in case one batch of documents arrives is not an option. There has to be /some/ way to make ES allocate its resources in a way that doesn't get the process killed.

I've seen people mention that dynamic heap sizing in Java is not THAT big of a problem any more. However, due to the bootstrap checks ES can only be started with a fixed X GB heap (the initial and max heap sizes have to match). No more. No less. Ever.
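For reference, this is how I pin the heap in config/jvm.options. On a 4 GB node the usual guidance is to give ES at most about half the RAM, and the heap size bootstrap check insists that min and max match; the 2g value here is just my guess for this node, not a recommendation:

```
# config/jvm.options (or a file under config/jvm.options.d/ on recent versions)
# Fixed heap for a 4 GB RAM node: roughly half the RAM, and Xms == Xmx,
# otherwise the heap size bootstrap check refuses to start the node.
-Xms2g
-Xmx2g
```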

This makes it extremely challenging to let ES queue up and absorb a peak of data while conserving resources when usage is low.
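For the "absorb a peak" part, the one thing I have found so far is Logstash's persistent queue, which buffers events on disk in front of ES instead of holding them in memory. A sketch of the logstash.yml settings (the size and path are placeholder values for my setup, not recommendations):

```
# logstash.yml -- buffer bursts on disk instead of in memory
queue.type: persisted                  # default is "memory"
queue.max_bytes: 4gb                   # cap on the on-disk buffer; pick something the disks can hold
path.queue: /var/lib/logstash/queue    # optional, defaults to a directory under path.data
```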

I will not be in charge of the schemas of incoming documents and indices, so I cannot fiddle with individual field mappings and index tricks to speed things up. Things taking (a loong) time to process when they come in is acceptable. Crashing in a loop forever, less so.

I have also tried finding ways to have Filebeat and Logstash limit their throughput, but they hose the poor data node every freaking time :frowning:
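The knobs I have been experimenting with on the Logstash side are the pipeline batch size and worker count, which together decide how big and how concurrent the bulk requests to ES end up being. A sketch of logstash.yml for a 2 vCPU / 4 GB node (the values are guesses for this setup):

```
# logstash.yml -- smaller, less concurrent bulk requests to ES
pipeline.workers: 1          # default is one worker per CPU core
pipeline.batch.size: 125     # events per worker per bulk request (125 is the default)
pipeline.batch.delay: 50     # ms to wait for a batch to fill before flushing (default)
```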

How big are your bulk requests? I.e., how many documents does Logstash send per bulk, and what volume in MB does that represent?

Can you tell what the root cause of the OOM is (many different things can cause it)? Any stack trace?

Do you also have search requests going on when indexing your bulk?

How big is your current index? How many shards does it have?

Filebeat and Logstash by default use quite small bulk requests over a few connections in parallel, in order not to overload the cluster. Are you using them with default settings?
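For reference, when Filebeat ships to Logstash the relevant output settings look roughly like this; the host is a placeholder, and as far as I remember the values shown are the defaults:

```
# filebeat.yml -- Logstash output section, defaults written out explicitly
output.logstash:
  hosts: ["logstash-host:5044"]   # placeholder host
  bulk_max_size: 2048             # max events per batch sent to Logstash (default)
  worker: 1                       # connections per configured host (default)
```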
