We are using a 4 GB storage-optimized (dense) instance in Elastic Cloud to store millions of tokens with embeddings.
We index the embeddings with an upsert operation.
We follow a two-step process:
1- Index the tokens in Elasticsearch without embeddings.
2- Update the indexed tokens with their embeddings.
The update operation runs continuously as long as there are tokens without embeddings.
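For context, step 2 is roughly the following sketch. The index name, the `embedding` field name, and the batching are assumptions on my part; it uses the official Python client's bulk helper (`elasticsearch.helpers.bulk`) to send partial-document updates:

```python
# Hypothetical sketch of step 2: attach embeddings to already-indexed tokens.
# The index name and the "embedding" dense_vector field name are assumptions.

def build_update_actions(index, docs_with_embeddings):
    """Turn (doc_id, vector) pairs into partial-update bulk actions."""
    for doc_id, vector in docs_with_embeddings:
        yield {
            "_op_type": "update",          # partial update of an existing doc
            "_index": index,
            "_id": doc_id,
            "doc": {"embedding": vector},  # write the vector into the doc
        }

# Against a live cluster this generator would be fed to the bulk helper:
# from elasticsearch import Elasticsearch, helpers
# es = Elasticsearch("https://<cluster-endpoint>")
# helpers.bulk(es, build_update_actions("semantic-data-index", batch))
```

Each such update rewrites the document, which creates new segments and deletes, so it also drives Lucene segment merges in the background.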
We are observing that after about 1 hour of updates, heap usage on the nodes breaches its 1.9 GB limit, causing the node to go down.
At that point, old-generation GC is triggered.
I am happy to share metrics from when this occurs (monitoring is enabled on the cluster).
Can someone please help with the root-cause analysis (RCA) of this issue?
The hot threads output shows this thread when heap crosses the limit:
100.0% [cpu=78.8%, other=21.2%] (500ms out of 500ms) cpu usage by thread 'elasticsearch[instance-0000000018][[semantic-data-index][0]: Lucene Merge Thread #0]'
2/10 snapshots sharing following 25 elements
app/org.apache.lucene.core@9.10.0/org.apache.lucene.internal.vectorization.PanamaVectorUtilSupport.squareDistanceBody256(PanamaVectorUtilSupport.java:540)
app/org.apache.lucene.core@9.10.0/org.apache.lucene.internal.vectorization.PanamaVectorUtilSupport.squareDistance(PanamaVectorUtilSupport.java:522)