CPU heavy load after indexing 1.5M vectors

Hi,
I want to build an index with more than 4M vectors of dimension 768. My setup for now is a DigitalOcean droplet with 4 GB of RAM and 2 vCPUs (I plan to increase this as needed).

I started by adding the first million vectors, which seemed to be OK. But after adding an additional 500k vectors, although my code ran fine and the index reports 1.5M vectors, I now see a process running permanently in the background and consuming a lot of CPU. It has been running for more than 12 hours. This is what I get when printing es.nodes.hot_threads():

>>> print(es.nodes.hot_threads())
::: {julia-es}{S4521jOZTJaZR9KNKsMcMw}{LZe3maChRY-nfLeEkMb8ow}{julia-es}{127.0.0.1}{127.0.0.1:9300}{cdfhilmrstw}{8.8.1}{ml.allocated_processors_double=2.0, xpack.installed=true, ml.machine_memory=4101406720, ml.allocated_processors=2, ml.max_jvm_size=2051014656}
   Hot threads at 2023-07-10T07:50:34.163Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:
   
   100.0% [cpu=28.0%, other=72.0%] (500ms out of 500ms) cpu usage by thread 'elasticsearch[julia-es][[julia_dgsi][0]: Lucene Merge Thread #58]'
     7/10 snapshots sharing following 22 elements
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.codecs.lucene95.OffHeapFloatVectorValues.vectorValue(OffHeapFloatVectorValues.java:61)
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.codecs.lucene95.OffHeapFloatVectorValues$DenseOffHeapVectorValues.vectorValue(OffHeapFloatVectorValues.java:86)
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.util.hnsw.HnswGraphSearcher.compare(HnswGraphSearcher.java:290)
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.util.hnsw.HnswGraphSearcher.searchLevel(HnswGraphSearcher.java:267)
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.util.hnsw.HnswGraphSearcher.searchLevel(HnswGraphSearcher.java:208)
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.util.hnsw.HnswGraphBuilder.addGraphNode(HnswGraphBuilder.java:278)
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.util.hnsw.HnswGraphBuilder.addGraphNode(HnswGraphBuilder.java:286)
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.util.hnsw.HnswGraphBuilder.addVectors(HnswGraphBuilder.java:235)
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.util.hnsw.HnswGraphBuilder.build(HnswGraphBuilder.java:162)
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.codecs.lucene95.Lucene95HnswVectorsWriter.mergeOneField(Lucene95HnswVectorsWriter.java:477)
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.codecs.perfield.PerFieldKnnVectorsFormat$FieldsWriter.mergeOneField(PerFieldKnnVectorsFormat.java:117)
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.codecs.KnnVectorsWriter.merge(KnnVectorsWriter.java:98)
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.index.SegmentMerger.mergeVectorValues(SegmentMerger.java:255)
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.index.SegmentMerger$$Lambda$7848/0x00000008022d7cc0.merge(Unknown Source)
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.index.SegmentMerger.mergeWithLogging(SegmentMerger.java:298)
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:149)
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:5140)
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4680)
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.index.IndexWriter$IndexWriterMergeSource.merge(IndexWriter.java:6432)
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:639)
       app/org.elasticsearch.server@8.8.1/org.elasticsearch.index.engine.ElasticsearchConcurrentMergeScheduler.doMerge(ElasticsearchConcurrentMergeScheduler.java:118)
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:700)
     3/10 snapshots sharing following 22 elements
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.codecs.lucene95.OffHeapFloatVectorValues.vectorValue(OffHeapFloatVectorValues.java:61)
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.codecs.lucene95.OffHeapFloatVectorValues$DenseOffHeapVectorValues.vectorValue(OffHeapFloatVectorValues.java:86)
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.util.hnsw.HnswGraphSearcher.compare(HnswGraphSearcher.java:290)
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.util.hnsw.HnswGraphSearcher.searchLevel(HnswGraphSearcher.java:267)
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.util.hnsw.HnswGraphSearcher.searchLevel(HnswGraphSearcher.java:208)
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.util.hnsw.HnswGraphBuilder.addGraphNode(HnswGraphBuilder.java:273)
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.util.hnsw.HnswGraphBuilder.addGraphNode(HnswGraphBuilder.java:286)
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.util.hnsw.HnswGraphBuilder.addVectors(HnswGraphBuilder.java:235)
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.util.hnsw.HnswGraphBuilder.build(HnswGraphBuilder.java:162)
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.codecs.lucene95.Lucene95HnswVectorsWriter.mergeOneField(Lucene95HnswVectorsWriter.java:477)
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.codecs.perfield.PerFieldKnnVectorsFormat$FieldsWriter.mergeOneField(PerFieldKnnVectorsFormat.java:117)
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.codecs.KnnVectorsWriter.merge(KnnVectorsWriter.java:98)
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.index.SegmentMerger.mergeVectorValues(SegmentMerger.java:255)
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.index.SegmentMerger$$Lambda$7848/0x00000008022d7cc0.merge(Unknown Source)
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.index.SegmentMerger.mergeWithLogging(SegmentMerger.java:298)
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:149)
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:5140)
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4680)
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.index.IndexWriter$IndexWriterMergeSource.merge(IndexWriter.java:6432)
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:639)
       app/org.elasticsearch.server@8.8.1/org.elasticsearch.index.engine.ElasticsearchConcurrentMergeScheduler.doMerge(ElasticsearchConcurrentMergeScheduler.java:118)
       app/org.apache.lucene.core@9.6.0/org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:700)

This is an indication that you don't have enough memory available for the file system cache (memory outside of the JVM heap). Because of this, when segments get merged, vector values have to be read from disk instead of memory, which significantly slows down merging.

You can check our guide on how to calculate the approximate amount of memory needed.
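
As a rough illustration of that kind of estimate, here is a minimal sketch that compares the raw size of the vectors with the memory left outside the JVM on this node. It assumes float32 vectors (4 bytes per dimension) and deliberately ignores the extra space the HNSW graph needs, so the real requirement is higher. The two memory figures are the ml.machine_memory and ml.max_jvm_size values from the hot threads header above; the function and variable names are only for illustration.

```python
# Rough lower bound on the off-heap memory needed to keep vectors cached.
# Assumption: float32 vectors, 4 bytes per dimension. HNSW graph overhead
# is NOT included, so the real requirement is higher than this.
BYTES_PER_FLOAT32 = 4


def raw_vector_bytes(num_vectors: int, dims: int) -> int:
    """Size of the raw vector data alone, in bytes."""
    return num_vectors * dims * BYTES_PER_FLOAT32


def gib(num_bytes: int) -> float:
    """Convert bytes to GiB for readable output."""
    return num_bytes / 2**30


# Values copied from the hot_threads header in the question.
MACHINE_MEMORY = 4_101_406_720  # ml.machine_memory
MAX_JVM_SIZE = 2_051_014_656    # ml.max_jvm_size

# Upper bound on what is left for the file system cache; the OS and
# other processes also take a share of this.
off_heap_bytes = MACHINE_MEMORY - MAX_JVM_SIZE

for num_vectors in (1_500_000, 4_000_000):
    needed = raw_vector_bytes(num_vectors, 768)
    print(
        f"{num_vectors:>9,} vectors: ~{gib(needed):.1f} GiB of raw vectors "
        f"vs ~{gib(off_heap_bytes):.1f} GiB outside the JVM"
    )
```

Under these assumptions, 1.5M vectors already need about 4.3 GiB for the raw data and 4M vectors about 11.4 GiB, while at most about 1.9 GiB is left outside the JVM on a 4 GB droplet. That fits with the merge thread spending its time reading vector values from disk.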

