I am trying to index 114 million rows of data with 150 million attachments from a PeopleSoft application into Elasticsearch 6.1 (I can't upgrade to a higher version because of PeopleSoft compatibility).
Elasticsearch configuration:
A single-node cluster (coordinating, ingest, and data roles) with:
OS: Oracle Linux 5.4.17-2136.336.5.1.el7uek.x86_64 x86_64
OCPU: 32
Memory: 128GB
Disk: 10TB
elasticsearch.yml options:
http.max_content_length: 300mb
indices.memory.index_buffer_size: 20%
bootstrap.memory_lock: true
jvm.options:
-Xms31g
-Xmx31g
-XX:ParallelGCThreads=20
-XX:NewRatio=2
"refresh_interval" : "-1"
Number of shards: 200
Replicas: 0
Attachment Handlers: 10
Max Sub Queue Size: 20
Full Direct Transfer: enabled
Index segment size: 10mb
Bulk transfer enabled
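For reference, here is a rough sketch of the indexing-buffer budget implied by the settings above. It uses only the heap size, the index_buffer_size percentage, and the shard count listed here. The even split across shards is a simplifying assumption (the buffer is shared by the shards on the node, and not all 200 are necessarily being written to at once), so treat the per-shard figure as an order-of-magnitude estimate.

```python
# Rough heap budget implied by the configuration above (a sketch, not a measurement).
GIB = 1024 ** 3
MIB = 1024 ** 2

heap_bytes = 31 * GIB            # -Xms31g / -Xmx31g
index_buffer_fraction = 0.20     # indices.memory.index_buffer_size: 20%
shard_count = 200                # Number of shards: 200

# Total memory the node may use for in-flight indexing buffers.
index_buffer_bytes = heap_bytes * index_buffer_fraction

# Assumption: all shards are actively indexing and share the buffer evenly.
per_shard_bytes = index_buffer_bytes / shard_count

print(f"index buffer: {index_buffer_bytes / GIB:.1f} GiB")
print(f"per shard:    {per_shard_bytes / MIB:.1f} MiB")
```

Under those assumptions the node-wide buffer is about 6.2 GiB, which works out to roughly 32 MiB per shard when all 200 shards are receiving documents.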
The first 40 million rows indexed at about 40,000 rows/min. Throughput then dropped gradually to about 20,000 rows/min by the time 50 million rows were indexed, and fell further to roughly 4,500 rows/min by 55 million rows.
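To put the slowdown in perspective, here is a quick back-of-the-envelope estimate of how long the remaining rows would take at each of the observed rates. It assumes the rate stays constant from 55 million rows onward, which the trend so far suggests it will not, so the slowest figure is probably optimistic. The function name is just illustrative.

```python
# Estimate the remaining indexing time at each observed throughput.
TOTAL_ROWS = 114_000_000     # rows to index, from the post
INDEXED_ROWS = 55_000_000    # rows indexed when the slowdown was measured

def days_remaining(rows_left, rows_per_minute):
    """Days needed to index rows_left at a constant rows_per_minute."""
    return rows_left / rows_per_minute / (60 * 24)

rows_left = TOTAL_ROWS - INDEXED_ROWS

for rate in (40_000, 20_000, 4_500):
    print(f"{rate:>6} rows/min -> {days_remaining(rows_left, rate):.1f} days")
```

At the original rate the remaining 59 million rows would take about a day; at the current rate they would take more than nine days.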
The hot threads output shows the following:
Hot threads at 2024-12-11T02:21:46.612Z, interval=500ms, busiestThreads=99999, ignoreIdleThreads=true:
100.2% (500.7ms out of 500ms) cpu usage by thread 'elasticsearch[RLskF8A][bulk][T#20]'
7/10 snapshots sharing following 24 elements
org.apache.tika.parser.pdf.PDF2XHTML.processPage(PDF2XHTML.java:147)
org.apache.pdfbox.text.PDFTextStripper.processPages(PDFTextStripper.java:319)
org.apache.pdfbox.text.PDFTextStripper.writeText(PDFTextStripper.java:266)
org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:117)
org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:168)
org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
org.apache.tika.Tika.parseToString(Tika.java:568)
org.elasticsearch.ingest.attachment.TikaImpl.lambda$0(TikaImpl.java:110)
org.elasticsearch.ingest.attachment.TikaImpl$$Lambda$1926/149886079.run(Unknown Source)
java.security.AccessController.doPrivileged(Native Method)
org.elasticsearch.ingest.attachment.TikaImpl.parse(TikaImpl.java:109)
org.elasticsearch.ingest.attachment.AttachmentProcessor.execute(AttachmentProcessor.java:88)
org.elasticsearch.ingest.CompoundProcessor.execute(CompoundProcessor.java:100)
org.elasticsearch.ingest.CompoundProcessor.execute(CompoundProcessor.java:100)
org.elasticsearch.ingest.Pipeline.execute(Pipeline.java:58)
org.elasticsearch.ingest.PipelineExecutionService.innerExecute(PipelineExecutionService.java:169)
org.elasticsearch.ingest.PipelineExecutionService.access$000(PipelineExecutionService.java:42)
org.elasticsearch.ingest.PipelineExecutionService$2.doRun(PipelineExecutionService.java:94)
org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:637)
org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)
2/10 snapshots sharing following 23 elements
org.apache.tika.parser.microsoft.WordExtractor.handleParagraph(WordExtractor.java:286)
org.apache.tika.parser.microsoft.WordExtractor.parse(WordExtractor.java:189)
org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:176)
org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:132)
org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
org.apache.tika.Tika.parseToString(Tika.java:568)
org.elasticsearch.ingest.attachment.TikaImpl.lambda$0(TikaImpl.java:110)
org.elasticsearch.ingest.attachment.TikaImpl$$Lambda$1926/149886079.run(Unknown Source)
java.security.AccessController.doPrivileged(Native Method)
org.elasticsearch.ingest.attachment.TikaImpl.parse(TikaImpl.java:109)
org.elasticsearch.ingest.attachment.AttachmentProcessor.execute(AttachmentProcessor.java:88)
org.elasticsearch.ingest.CompoundProcessor.execute(CompoundProcessor.java:100)
org.elasticsearch.ingest.CompoundProcessor.execute(CompoundProcessor.java:100)
org.elasticsearch.ingest.Pipeline.execute(Pipeline.java:58)
org.elasticsearch.ingest.PipelineExecutionService.innerExecute(PipelineExecutionService.java:169)
org.elasticsearch.ingest.PipelineExecutionService.access$000(PipelineExecutionService.java:42)
org.elasticsearch.ingest.PipelineExecutionService$2.doRun(PipelineExecutionService.java:94)
org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:637)
org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)
unique snapshot
java.util.zip.Inflater.inflateBytes(Native Method)
java.util.zip.Inflater.inflate(Inflater.java:259)
java.util.zip.InflaterInputStream.read(InflaterInputStream.java:152)
org.apache.poi.openxml4j.util.ZipSecureFile$ThresholdInputStream.read(ZipSecureFile.java:220)
com.sun.org.apache.xerces.internal.impl.XMLEntityManager$RewindableInputStream.read(XMLEntityManager.java:2942)
com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.read(UTF8Reader.java:303)
com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(XMLEntityScanner.java:1895)
com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.skipChar(XMLEntityScanner.java:1551)
com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2823)
com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:605)
com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:113)
com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:507)
com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:867)
com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:796)
com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:142)
com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:247)
com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:339)
javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:121)
org.apache.poi.util.DocumentHelper.readDocument(DocumentHelper.java:140)
org.apache.poi.POIXMLTypeLoader.parse(POIXMLTypeLoader.java:163)
org.openxmlformats.schemas.spreadsheetml.x2006.main.CommentsDocument$Factory.parse(Unknown Source)
org.apache.poi.xssf.model.CommentsTable.readFrom(CommentsTable.java:72)
org.apache.poi.xssf.model.CommentsTable.<init>(CommentsTable.java:67)
org.apache.poi.xssf.eventusermodel.XSSFReader$SheetIterator.getSheetComments(XSSFReader.java:343)
org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.buildXHTML(XSSFExcelExtractorDecorator.java:157)
org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.getXHTML(AbstractOOXMLExtractor.java:135)
org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.getXHTML(XSSFExcelExtractorDecorator.java:120)
org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:143)
org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:106)
org.apache.tika.parser.ParserDecorator.parse(ParserDecorator.java:188)
org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
org.apache.tika.Tika.parseToString(Tika.java:568)
org.elasticsearch.ingest.attachment.TikaImpl.lambda$0(TikaImpl.java:110)
org.elasticsearch.ingest.attachment.TikaImpl$$Lambda$1926/149886079.run(Unknown Source)
java.security.AccessController.doPrivileged(Native Method)
org.elasticsearch.ingest.attachment.TikaImpl.parse(TikaImpl.java:109)
org.elasticsearch.ingest.attachment.AttachmentProcessor.execute(AttachmentProcessor.java:88)
org.elasticsearch.ingest.CompoundProcessor.execute(CompoundProcessor.java:100)
org.elasticsearch.ingest.CompoundProcessor.execute(CompoundProcessor.java:100)
org.elasticsearch.ingest.Pipeline.execute(Pipeline.java:58)
org.elasticsearch.ingest.PipelineExecutionService.innerExecute(PipelineExecutionService.java:169)
org.elasticsearch.ingest.PipelineExecutionService.access$000(PipelineExecutionService.java:42)
org.elasticsearch.ingest.PipelineExecutionService$2.doRun(PipelineExecutionService.java:94)
org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:637)
org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)
98.8% (494ms out of 500ms) cpu usage by thread 'elasticsearch[RLskF8A][management][T#3]'
2/10 snapshots sharing following 24 elements
java.nio.file.Files.newDirectoryStream(Files.java:457)
org.apache.lucene.store.FSDirectory.listAll(FSDirectory.java:215)
org.apache.lucene.store.FSDirectory.listAll(FSDirectory.java:234)
org.apache.lucene.store.FilterDirectory.listAll(FilterDirectory.java:57)
org.elasticsearch.index.store.Store$StoreStatsCache.estimateSize(Store.java:1418)
org.elasticsearch.index.store.Store$StoreStatsCache.refresh(Store.java:1410)
org.elasticsearch.index.store.Store$StoreStatsCache.refresh(Store.java:1399)
org.elasticsearch.common.util.SingleObjectCache.getOrRefresh(SingleObjectCache.java:54)
org.elasticsearch.index.store.Store.stats(Store.java:349)
org.elasticsearch.index.shard.IndexShard.storeStats(IndexShard.java:947)
org.elasticsearch.action.admin.indices.stats.CommonStats.<init>(CommonStats.java:178)
org.elasticsearch.action.admin.indices.stats.TransportIndicesStatsAction.shardOperation(TransportIndicesStatsAction.java:164)
org.elasticsearch.action.admin.indices.stats.TransportIndicesStatsAction.shardOperation(TransportIndicesStatsAction.java:45)
org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.onShardOperation(TransportBroadcastByNodeAction.java:433)
org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.messageReceived(TransportBroadcastByNodeAction.java:412)
org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.messageReceived(TransportBroadcastByNodeAction.java:399)
org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:30)
org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:66)
org.elasticsearch.transport.TransportService$7.doRun(TransportService.java:652)
org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:637)
org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)
7/10 snapshots sharing following 37 elements
java.io.UnixFileSystem.canonicalize0(Native Method)
java.io.UnixFileSystem.canonicalize(UnixFileSystem.java:172)
java.io.File.getCanonicalPath(File.java:618)
java.io.FilePermission$1.run(FilePermission.java:224)
java.io.FilePermission$1.run(FilePermission.java:212)
java.security.AccessController.doPrivileged(Native Method)
java.io.FilePermission.init(FilePermission.java:212)
java.io.FilePermission.<init>(FilePermission.java:299)
java.lang.SecurityManager.checkRead(SecurityManager.java:888)
sun.nio.fs.UnixPath.checkRead(UnixPath.java:795)
sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(UnixFileAttributeViews.java:49)
sun.nio.fs.UnixFileSystemProvider.readAttributes(UnixFileSystemProvider.java:144)
sun.nio.fs.LinuxFileSystemProvider.readAttributes(LinuxFileSystemProvider.java:99)
java.nio.file.Files.readAttributes(Files.java:1737)
java.nio.file.Files.size(Files.java:2332)
org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:243)
org.apache.lucene.store.FilterDirectory.fileLength(FilterDirectory.java:67)
org.elasticsearch.index.store.Store$StoreStatsCache.estimateSize(Store.java:1421)
org.elasticsearch.index.store.Store$StoreStatsCache.refresh(Store.java:1410)
org.elasticsearch.index.store.Store$StoreStatsCache.refresh(Store.java:1399)
org.elasticsearch.common.util.SingleObjectCache.getOrRefresh(SingleObjectCache.java:54)
org.elasticsearch.index.store.Store.stats(Store.java:349)
org.elasticsearch.index.shard.IndexShard.storeStats(IndexShard.java:947)
org.elasticsearch.action.admin.indices.stats.CommonStats.<init>(CommonStats.java:178)
org.elasticsearch.action.admin.indices.stats.TransportIndicesStatsAction.shardOperation(TransportIndicesStatsAction.java:164)
org.elasticsearch.action.admin.indices.stats.TransportIndicesStatsAction.shardOperation(TransportIndicesStatsAction.java:45)
org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.onShardOperation(TransportBroadcastByNodeAction.java:433)
org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.messageReceived(TransportBroadcastByNodeAction.java:412)
org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.messageReceived(TransportBroadcastByNodeAction.java:399)
org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:30)
org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:66)
org.elasticsearch.transport.TransportService$7.doRun(TransportService.java:652)
org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:637)
org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)
unique snapshot
sun.nio.fs.UnixNativeDispatcher.stat0(Native Method)
sun.nio.fs.UnixNativeDispatcher.stat(UnixNativeDispatcher.java:286)
sun.nio.fs.UnixFileAttributes.get(UnixFileAttributes.java:70)
sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(UnixFileAttributeViews.java:52)
sun.nio.fs.UnixFileSystemProvider.readAttributes(UnixFileSystemProvider.java:144)
sun.nio.fs.LinuxFileSystemProvider.readAttributes(LinuxFileSystemProvider.java:99)
java.nio.file.Files.readAttributes(Files.java:1737)
java.nio.file.Files.size(Files.java:2332)
org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:243)
org.apache.lucene.store.FilterDirectory.fileLength(FilterDirectory.java:67)
org.elasticsearch.index.store.Store$StoreStatsCache.estimateSize(Store.java:1421)
org.elasticsearch.index.store.Store$StoreStatsCache.refresh(Store.java:1410)
org.elasticsearch.index.store.Store$StoreStatsCache.refresh(Store.java:1399)
org.elasticsearch.common.util.SingleObjectCache.getOrRefresh(SingleObjectCache.java:54)
org.elasticsearch.index.store.Store.stats(Store.java:349)
org.elasticsearch.index.shard.IndexShard.storeStats(IndexShard.java:947)
org.elasticsearch.action.admin.indices.stats.CommonStats.<init>(CommonStats.java:178)
org.elasticsearch.action.admin.indices.stats.TransportIndicesStatsAction.shardOperation(TransportIndicesStatsAction.java:164)
org.elasticsearch.action.admin.indices.stats.TransportIndicesStatsAction.shardOperation(TransportIndicesStatsAction.java:45)
org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.onShardOperation(TransportBroadcastByNodeAction.java:433)
org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.messageReceived(TransportBroadcastByNodeAction.java:412)
org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.messageReceived(TransportBroadcastByNodeAction.java:399)
org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:30)
org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:66)
org.elasticsearch.transport.TransportService$7.doRun(TransportService.java:652)
org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:637)
org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)
92.5% (462.3ms out of 500ms) cpu usage by thread 'elasticsearch[RLskF8A][refresh][T#5]'
7/10 snapshots sharing following 28 elements
org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter$TermsWriter.pushTerm(BlockTreeTermsWriter.java:905)
org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter$TermsWriter.write(BlockTreeTermsWriter.java:869)
org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter.write(BlockTreeTermsWriter.java:343)
org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsWriter.write(PerFieldPostingsFormat.java:140)
org.apache.lucene.index.FreqProxTermsWriter.flush(FreqProxTermsWriter.java:108)
org.apache.lucene.index.DefaultIndexingChain.flush(DefaultIndexingChain.java:162)
org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:451)
org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:542)
org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:658)
org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:453)
org.apache.lucene.index.StandardDirectoryReader.doOpenFromWriter(StandardDirectoryReader.java:293)
org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:268)
org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:258)
org.apache.lucene.index.FilterDirectoryReader.doOpenIfChanged(FilterDirectoryReader.java:104)
org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:140)
org.apache.lucene.search.SearcherManager.refreshIfNeeded(SearcherManager.java:156)
org.apache.lucene.search.SearcherManager.refreshIfNeeded(SearcherManager.java:58)
org.apache.lucene.search.ReferenceManager.doMaybeRefresh(ReferenceManager.java:176)
org.apache.lucene.search.ReferenceManager.maybeRefreshBlocking(ReferenceManager.java:253)
org.elasticsearch.index.engine.InternalEngine.refresh(InternalEngine.java:1336)
org.elasticsearch.index.engine.InternalEngine.writeIndexingBuffer(InternalEngine.java:1366)
org.elasticsearch.index.shard.IndexShard.writeIndexingBuffer(IndexShard.java:1702)
org.elasticsearch.indices.IndexingMemoryController$1.doRun(IndexingMemoryController.java:177)
org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:637)
org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)
2/10 snapshots sharing following 30 elements
org.apache.lucene.store.DataOutput.writeVInt(DataOutput.java:191)
org.apache.lucene.codecs.lucene50.Lucene50PostingsWriter.finishTerm(Lucene50PostingsWriter.java:392)
org.apache.lucene.codecs.PushPostingsWriterBase.writeTerm(PushPostingsWriterBase.java:169)
org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter$TermsWriter.write(BlockTreeTermsWriter.java:864)
org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter.write(BlockTreeTermsWriter.java:343)
org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsWriter.write(PerFieldPostingsFormat.java:140)
org.apache.lucene.index.FreqProxTermsWriter.flush(FreqProxTermsWriter.java:108)
org.apache.lucene.index.DefaultIndexingChain.flush(DefaultIndexingChain.java:162)
org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:451)
org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:542)
org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:658)
org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:453)
org.apache.lucene.index.StandardDirectoryReader.doOpenFromWriter(StandardDirectoryReader.java:293)
org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:268)
org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:258)
org.apache.lucene.index.FilterDirectoryReader.doOpenIfChanged(FilterDirectoryReader.java:104)
org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:140)
org.apache.lucene.search.SearcherManager.refreshIfNeeded(SearcherManager.java:156)
org.apache.lucene.search.SearcherManager.refreshIfNeeded(SearcherManager.java:58)
org.apache.lucene.search.ReferenceManager.doMaybeRefresh(ReferenceManager.java:176)
org.apache.lucene.search.ReferenceManager.maybeRefreshBlocking(ReferenceManager.java:253)
org.elasticsearch.index.engine.InternalEngine.refresh(InternalEngine.java:1336)
org.elasticsearch.index.engine.InternalEngine.writeIndexingBuffer(InternalEngine.java:1366)
org.elasticsearch.index.shard.IndexShard.writeIndexingBuffer(IndexShard.java:1702)
org.elasticsearch.indices.IndexingMemoryController$1.doRun(IndexingMemoryController.java:177)
org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:637)
org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)
unique snapshot
org.apache.lucene.codecs.PushPostingsWriterBase.writeTerm(PushPostingsWriterBase.java:122)
org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter$TermsWriter.write(BlockTreeTermsWriter.java:864)
org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter.write(BlockTreeTermsWriter.java:343)
org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsWriter.write(PerFieldPostingsFormat.java:140)
org.apache.lucene.index.FreqProxTermsWriter.flush(FreqProxTermsWriter.java:108)
org.apache.lucene.index.DefaultIndexingChain.flush(DefaultIndexingChain.java:162)
org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:451)
org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:542)
org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:658)
org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:453)
org.apache.lucene.index.StandardDirectoryReader.doOpenFromWriter(StandardDirectoryReader.java:293)
org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:268)
org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:258)
org.apache.lucene.index.FilterDirectoryReader.doOpenIfChanged(FilterDirectoryReader.java:104)
org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:140)
org.apache.lucene.search.SearcherManager.refreshIfNeeded(SearcherManager.java:156)
org.apache.lucene.search.SearcherManager.refreshIfNeeded(SearcherManager.java:58)
org.apache.lucene.search.ReferenceManager.doMaybeRefresh(ReferenceManager.java:176)
org.apache.lucene.search.ReferenceManager.maybeRefreshBlocking(ReferenceManager.java:253)
org.elasticsearch.index.engine.InternalEngine.refresh(InternalEngine.java:1336)
org.elasticsearch.index.engine.InternalEngine.writeIndexingBuffer(InternalEngine.java:1366)
org.elasticsearch.index.shard.IndexShard.writeIndexingBuffer(IndexShard.java:1702)
org.elasticsearch.indices.IndexingMemoryController$1.doRun(IndexingMemoryController.java:177)
org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:637)
org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)
89.0% (444.8ms out of 500ms) cpu usage by thread 'elasticsearch[RLskF8A][refresh][T#4]'
3/10 snapshots sharing following 29 elements
org.apache.lucene.index.DefaultIndexingChain.flush(DefaultIndexingChain.java:162)
org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:451)
org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:542)
org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:658)
org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:453)
org.apache.lucene.index.StandardDirectoryReader.doOpenFromWriter(StandardDirectoryReader.java:293)
org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:268)
org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:258)
org.apache.lucene.index.FilterDirectoryReader.doOpenIfChanged(FilterDirectoryReader.java:104)
org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:140)
org.apache.lucene.search.SearcherManager.refreshIfNeeded(SearcherManager.java:156)
org.apache.lucene.search.SearcherManager.refreshIfNeeded(SearcherManager.java:58)
org.apache.lucene.search.ReferenceManager.doMaybeRefresh(ReferenceManager.java:176)
org.apache.lucene.search.ReferenceManager.maybeRefreshBlocking(ReferenceManager.java:253)
org.elasticsearch.index.engine.InternalEngine$ExternalSearcherManager.refreshIfNeeded(InternalEngine.java:292)
org.elasticsearch.index.engine.InternalEngine$ExternalSearcherManager.refreshIfNeeded(InternalEngine.java:267)
org.apache.lucene.search.ReferenceManager.doMaybeRefresh(ReferenceManager.java:176)
org.apache.lucene.search.ReferenceManager.maybeRefreshBlocking(ReferenceManager.java:253)
org.elasticsearch.index.engine.InternalEngine.refresh(InternalEngine.java:1332)
org.elasticsearch.index.engine.InternalEngine.refresh(InternalEngine.java:1314)
org.elasticsearch.index.shard.IndexShard.refresh(IndexShard.java:855)
org.elasticsearch.index.IndexService.maybeRefreshEngine(IndexService.java:695)
org.elasticsearch.index.IndexService.access$400(IndexService.java:97)
org.elasticsearch.index.IndexService$AsyncRefreshTask.runInternal(IndexService.java:899)
org.elasticsearch.index.IndexService$BaseAsyncTask.run(IndexService.java:809)
org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:568)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)
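To make the dump above easier to read at a glance, here is a small sketch that sums the reported CPU percentage per thread pool from the "cpu usage by thread" header lines. The regular expression is my own and only assumes the header format shown in this output; the sample text is the four header lines copied from the dump, not the full hot threads response.

```python
import re
from collections import defaultdict

# Matches header lines such as:
#   100.2% (500.7ms out of 500ms) cpu usage by thread 'elasticsearch[RLskF8A][bulk][T#20]'
HEADER = re.compile(
    r"(?P<pct>\d+(?:\.\d+)?)% \(.*?\) cpu usage by thread "
    r"'elasticsearch\[(?P<node>[^\]]+)\]\[(?P<pool>[^\]]+)\]\[T#(?P<n>\d+)\]'"
)

def cpu_by_pool(hot_threads_text):
    """Sum the reported CPU percentage per thread pool name."""
    totals = defaultdict(float)
    for line in hot_threads_text.splitlines():
        match = HEADER.search(line)
        if match:
            totals[match.group("pool")] += float(match.group("pct"))
    return dict(totals)

# Header lines copied from the hot threads output in this post.
sample = """\
100.2% (500.7ms out of 500ms) cpu usage by thread 'elasticsearch[RLskF8A][bulk][T#20]'
98.8% (494ms out of 500ms) cpu usage by thread 'elasticsearch[RLskF8A][management][T#3]'
92.5% (462.3ms out of 500ms) cpu usage by thread 'elasticsearch[RLskF8A][refresh][T#5]'
89.0% (444.8ms out of 500ms) cpu usage by thread 'elasticsearch[RLskF8A][refresh][T#4]'
"""

result = cpu_by_pool(sample)
for pool, pct in sorted(result.items(), key=lambda item: -item[1]):
    print(f"{pool:<12} {pct:.1f}%")
```

In this sample the busy threads fall into three pools: bulk (attachment parsing through Tika), management (store stats collection), and refresh (segment flushes).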
The index stats output shows the following:
{
"_shards" : {
"total" : 200,
"successful" : 200,
"failed" : 0
},
"_all" : {
"primaries" : {
"docs" : {
"count" : 55543556,
"deleted" : 47781
},
"store" : {
"size_in_bytes" : 2672291755255
},
"indexing" : {
"index_total" : 4120398,
"index_time_in_millis" : 50952877,
"index_current" : 4,
"index_failed" : 0,
"delete_total" : 0,
"delete_time_in_millis" : 0,
"delete_current" : 0,
"noop_update_total" : 0,
"is_throttled" : false,
"throttle_time_in_millis" : 0
},
"get" : {
"total" : 1,
"time_in_millis" : 1,
"exists_total" : 0,
"exists_time_in_millis" : 0,
"missing_total" : 1,
"missing_time_in_millis" : 1,
"current" : 0
},
"search" : {
"open_contexts" : 0,
"query_total" : 1600,
"query_time_in_millis" : 288,
"query_current" : 0,
"fetch_total" : 0,
"fetch_time_in_millis" : 0,
"fetch_current" : 0,
"scroll_total" : 0,
"scroll_time_in_millis" : 0,
"scroll_current" : 0,
"suggest_total" : 0,
"suggest_time_in_millis" : 0,
"suggest_current" : 0
},
"merges" : {
"current" : 22,
"current_docs" : 1629270,
"current_size_in_bytes" : 92451402153,
"total" : 3515,
"total_time_in_millis" : 789186847,
"total_docs" : 18493872,
"total_size_in_bytes" : 1133936081916,
"total_stopped_time_in_millis" : 0,
"total_throttled_time_in_millis" : 141446873,
"total_auto_throttle_in_bytes" : 1505047452
},
"refresh" : {
"total" : 29742, 30413, 30565 (immediately after change) 2:10pm, 30613
"total_time_in_millis" : 60025231,
"listeners" : 0
},
"flush" : {
"total" : 600,
"total_time_in_millis" : 1905576
},
"warmer" : {
"current" : 0,
"total" : 26348,
"total_time_in_millis" : 2047
},
"query_cache" : {
"memory_size_in_bytes" : 0,
"total_count" : 0,
"hit_count" : 0,
"miss_count" : 0,
"cache_size" : 0,
"cache_count" : 0,
"evictions" : 0
},
"fielddata" : {
"memory_size_in_bytes" : 0,
"evictions" : 0
},
"completion" : {
"size_in_bytes" : 0
},
"segments" : {
"count" : 5537,
"memory_in_bytes" : 12468548298,
"terms_memory_in_bytes" : 12384593318,
"stored_fields_memory_in_bytes" : 47619256,
"term_vectors_memory_in_bytes" : 0,
"norms_memory_in_bytes" : 21696640,
"points_memory_in_bytes" : 3065192,
"doc_values_memory_in_bytes" : 11573892,
"index_writer_memory_in_bytes" : 6503754102,
"version_map_memory_in_bytes" : 12366103,
"fixed_bit_set_memory_in_bytes" : 0,
"max_unsafe_auto_id_timestamp" : -1,
"file_sizes" : { }
},
"translog" : {
"operations" : 2133296,
"size_in_bytes" : 114717685968,
"uncommitted_operations" : 1159571,
"uncommitted_size_in_bytes" : 62252691813
},
"request_cache" : {
"memory_size_in_bytes" : 0,
"evictions" : 0,
"hit_count" : 386,
"miss_count" : 1214
},
"recovery" : {
"current_as_source" : 0,
"current_as_target" : 0,
"throttle_time_in_millis" : 0
}
},
"total" : {
"docs" : {
"count" : 55543556,
"deleted" : 47781
},
"store" : {
"size_in_bytes" : 2672291755255
},
"indexing" : {
"index_total" : 4120398,
"index_time_in_millis" : 50952877,
"index_current" : 4,
"index_failed" : 0,
"delete_total" : 0,
"delete_time_in_millis" : 0,
"delete_current" : 0,
"noop_update_total" : 0,
"is_throttled" : false,
"throttle_time_in_millis" : 0
},
"get" : {
"total" : 1,
"time_in_millis" : 1,
"exists_total" : 0,
"exists_time_in_millis" : 0,
"missing_total" : 1,
"missing_time_in_millis" : 1,
"current" : 0
},
"search" : {
"open_contexts" : 0,
"query_total" : 1600,
"query_time_in_millis" : 288,
"query_current" : 0,
"fetch_total" : 0,
"fetch_time_in_millis" : 0,
"fetch_current" : 0,
"scroll_total" : 0,
"scroll_time_in_millis" : 0,
"scroll_current" : 0,
"suggest_total" : 0,
"suggest_time_in_millis" : 0,
"suggest_current" : 0
},
"merges" : {
"current" : 22,
"current_docs" : 1629270,
"current_size_in_bytes" : 92451402153,
"total" : 3515,
"total_time_in_millis" : 789186847,
"total_docs" : 18493872,
"total_size_in_bytes" : 1133936081916,
"total_stopped_time_in_millis" : 0,
"total_throttled_time_in_millis" : 141446873,
"total_auto_throttle_in_bytes" : 1505047452
},
"refresh" : {
"total" : 29742,
"total_time_in_millis" : 60025231,
"listeners" : 0
},
"flush" : {
"total" : 600,
"total_time_in_millis" : 1905576
},
"warmer" : {
"current" : 0,
"total" : 26348,
"total_time_in_millis" : 2047
},
"query_cache" : {
"memory_size_in_bytes" : 0,
"total_count" : 0,
"hit_count" : 0,
"miss_count" : 0,
"cache_size" : 0,
"cache_count" : 0,
"evictions" : 0
},
"fielddata" : {
"memory_size_in_bytes" : 0,
"evictions" : 0
},
"completion" : {
"size_in_bytes" : 0
},
"segments" : {
"count" : 5537,
"memory_in_bytes" : 12468548298,
"terms_memory_in_bytes" : 12384593318,
"stored_fields_memory_in_bytes" : 47619256,
"term_vectors_memory_in_bytes" : 0,
"norms_memory_in_bytes" : 21696640,
"points_memory_in_bytes" : 3065192,
"doc_values_memory_in_bytes" : 11573892,
"index_writer_memory_in_bytes" : 6503754102,
"version_map_memory_in_bytes" : 12366103,
"fixed_bit_set_memory_in_bytes" : 0,
"max_unsafe_auto_id_timestamp" : -1,
"file_sizes" : { }
},
"translog" : {
"operations" : 2133296,
"size_in_bytes" : 114717685968,
"uncommitted_operations" : 1159571,
"uncommitted_size_in_bytes" : 62252691813
},
"request_cache" : {
"memory_size_in_bytes" : 0,
"evictions" : 0,
"hit_count" : 386,
"miss_count" : 1214
},
"recovery" : {
"current_as_source" : 0,
"current_as_target" : 0,
"throttle_time_in_millis" : 0
}
}
},
"indices" : {
"icc_case_notes_crpte" : {
"primaries" : {
"docs" : {
"count" : 55543556,
"deleted" : 47781
},
"store" : {
"size_in_bytes" : 2672291755255
},
"indexing" : {
"index_total" : 4120398,
"index_time_in_millis" : 50952877,
"index_current" : 4,
"index_failed" : 0,
"delete_total" : 0,
"delete_time_in_millis" : 0,
"delete_current" : 0,
"noop_update_total" : 0,
"is_throttled" : false,
"throttle_time_in_millis" : 0
},
"get" : {
"total" : 1,
"time_in_millis" : 1,
"exists_total" : 0,
"exists_time_in_millis" : 0,
"missing_total" : 1,
"missing_time_in_millis" : 1,
"current" : 0
},
"search" : {
"open_contexts" : 0,
"query_total" : 1600,
"query_time_in_millis" : 288,
"query_current" : 0,
"fetch_total" : 0,
"fetch_time_in_millis" : 0,
"fetch_current" : 0,
"scroll_total" : 0,
"scroll_time_in_millis" : 0,
"scroll_current" : 0,
"suggest_total" : 0,
"suggest_time_in_millis" : 0,
"suggest_current" : 0
},
"merges" : {
"current" : 22,
"current_docs" : 1629270,
"current_size_in_bytes" : 92451402153,
"total" : 3515,
"total_time_in_millis" : 789186847,
"total_docs" : 18493872,
"total_size_in_bytes" : 1133936081916,
"total_stopped_time_in_millis" : 0,
"total_throttled_time_in_millis" : 141446873,
"total_auto_throttle_in_bytes" : 1505047452
},
"refresh" : {
"total" : 29742,
"total_time_in_millis" : 60025231,
"listeners" : 0
},
"flush" : {
"total" : 600,
"total_time_in_millis" : 1905576
},
"warmer" : {
"current" : 0,
"total" : 26348,
"total_time_in_millis" : 2047
},
"query_cache" : {
"memory_size_in_bytes" : 0,
"total_count" : 0,
"hit_count" : 0,
"miss_count" : 0,
"cache_size" : 0,
"cache_count" : 0,
"evictions" : 0
},
"fielddata" : {
"memory_size_in_bytes" : 0,
"evictions" : 0
},
"completion" : {
"size_in_bytes" : 0
},
"segments" : {
"count" : 5537,
"memory_in_bytes" : 12468548298,
"terms_memory_in_bytes" : 12384593318,
"stored_fields_memory_in_bytes" : 47619256,
"term_vectors_memory_in_bytes" : 0,
"norms_memory_in_bytes" : 21696640,
"points_memory_in_bytes" : 3065192,
"doc_values_memory_in_bytes" : 11573892,
"index_writer_memory_in_bytes" : 6503754102,
"version_map_memory_in_bytes" : 12366103,
"fixed_bit_set_memory_in_bytes" : 0,
"max_unsafe_auto_id_timestamp" : -1,
"file_sizes" : { }
},
"translog" : {
"operations" : 2133296,
"size_in_bytes" : 114717685968,
"uncommitted_operations" : 1159571,
"uncommitted_size_in_bytes" : 62252691813
},
"request_cache" : {
"memory_size_in_bytes" : 0,
"evictions" : 0,
"hit_count" : 386,
"miss_count" : 1214
},
"recovery" : {
"current_as_source" : 0,
"current_as_target" : 0,
"throttle_time_in_millis" : 0
}
},
"total" : {
"docs" : {
"count" : 55543556,
"deleted" : 47781
},
"store" : {
"size_in_bytes" : 2672291755255
},
"indexing" : {
"index_total" : 4120398,
"index_time_in_millis" : 50952877,
"index_current" : 4,
"index_failed" : 0,
"delete_total" : 0,
"delete_time_in_millis" : 0,
"delete_current" : 0,
"noop_update_total" : 0,
"is_throttled" : false,
"throttle_time_in_millis" : 0
},
"get" : {
"total" : 1,
"time_in_millis" : 1,
"exists_total" : 0,
"exists_time_in_millis" : 0,
"missing_total" : 1,
"missing_time_in_millis" : 1,
"current" : 0
},
"search" : {
"open_contexts" : 0,
"query_total" : 1600,
"query_time_in_millis" : 288,
"query_current" : 0,
"fetch_total" : 0,
"fetch_time_in_millis" : 0,
"fetch_current" : 0,
"scroll_total" : 0,
"scroll_time_in_millis" : 0,
"scroll_current" : 0,
"suggest_total" : 0,
"suggest_time_in_millis" : 0,
"suggest_current" : 0
},
"merges" : {
"current" : 22,
"current_docs" : 1629270,
"current_size_in_bytes" : 92451402153,
"total" : 3515,
"total_time_in_millis" : 789186847,
"total_docs" : 18493872,
"total_size_in_bytes" : 1133936081916,
"total_stopped_time_in_millis" : 0,
"total_throttled_time_in_millis" : 141446873,
"total_auto_throttle_in_bytes" : 1505047452
},
"refresh" : {
"total" : 29742,
"total_time_in_millis" : 60025231,
"listeners" : 0
},
"flush" : {
"total" : 600,
"total_time_in_millis" : 1905576
},
"warmer" : {
"current" : 0,
"total" : 26348,
"total_time_in_millis" : 2047
},
"query_cache" : {
"memory_size_in_bytes" : 0,
"total_count" : 0,
"hit_count" : 0,
"miss_count" : 0,
"cache_size" : 0,
"cache_count" : 0,
"evictions" : 0
},
"fielddata" : {
"memory_size_in_bytes" : 0,
"evictions" : 0
},
"completion" : {
"size_in_bytes" : 0
},
"segments" : {
"count" : 5537,
"memory_in_bytes" : 12468548298,
"terms_memory_in_bytes" : 12384593318,
"stored_fields_memory_in_bytes" : 47619256,
"term_vectors_memory_in_bytes" : 0,
"norms_memory_in_bytes" : 21696640,
"points_memory_in_bytes" : 3065192,
"doc_values_memory_in_bytes" : 11573892,
"index_writer_memory_in_bytes" : 6503754102,
"version_map_memory_in_bytes" : 12366103,
"fixed_bit_set_memory_in_bytes" : 0,
"max_unsafe_auto_id_timestamp" : -1,
"file_sizes" : { }
},
"translog" : {
"operations" : 2133296,
"size_in_bytes" : 114717685968,
"uncommitted_operations" : 1159571,
"uncommitted_size_in_bytes" : 62252691813
},
"request_cache" : {
"memory_size_in_bytes" : 0,
"evictions" : 0,
"hit_count" : 386,
"miss_count" : 1214
},
"recovery" : {
"current_as_source" : 0,
"current_as_target" : 0,
"throttle_time_in_millis" : 0
}
}
}
}
}
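To make the stats above easier to read, here is a rough sketch that derives a few ratios from them. The input numbers are copied from the "total" section of the output. The shard count (200) and heap size (31 GB) come from the configuration listed at the top of the post. The key names in the `stats` dict and the `derive` helper are my own shorthand, not Elasticsearch API names. Note that `index_total` is far lower than the document count, so these counters presumably cover only part of the run, and the per-document time is summed across all bulk threads.

```python
# Values copied from the "total" section of the index stats output above.
# The dict keys are shorthand chosen for this sketch, not API field paths.
stats = {
    "index_total": 4120398,
    "index_time_in_millis": 50952877,
    "merge_total_time_in_millis": 789186847,
    "merge_throttled_time_in_millis": 141446873,
    "store_size_in_bytes": 2672291755255,
    "segments_memory_in_bytes": 12468548298,
    "index_writer_memory_in_bytes": 6503754102,
    "translog_uncommitted_size_in_bytes": 62252691813,
}

NUM_SHARDS = 200            # from the index settings listed earlier
HEAP_BYTES = 31 * 1024**3   # from -Xmx31g
GIB = 1024**3


def derive(s, shards, heap_bytes):
    """Return a few derived ratios from the raw counters."""
    return {
        # Cumulative indexing time divided by documents indexed.
        "avg_index_ms_per_doc": s["index_time_in_millis"] / s["index_total"],
        # Share of total merge time that was spent throttled.
        "merge_throttled_fraction": (
            s["merge_throttled_time_in_millis"] / s["merge_total_time_in_millis"]
        ),
        # Average on-disk size per shard.
        "avg_shard_size_gib": s["store_size_in_bytes"] / shards / GIB,
        # Segment memory plus index writer memory, relative to the heap size.
        "segment_plus_writer_heap_fraction": (
            (s["segments_memory_in_bytes"] + s["index_writer_memory_in_bytes"])
            / heap_bytes
        ),
        # Translog data not yet committed by a flush.
        "translog_uncommitted_gib": s["translog_uncommitted_size_in_bytes"] / GIB,
    }


derived = derive(stats, NUM_SHARDS, HEAP_BYTES)
for name, value in derived.items():
    print(f"{name}: {value:.3f}")
```

These are only the arithmetic behind the raw counters, and I am not sure which of them (if any) explains the slowdown.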
The output of top shows the following:
top - 15:17:00 up 12 days, 20:33, 3 users, load average: 2.68, 3.02, 3.19
Tasks: 431 total, 1 running, 241 sleeping, 0 stopped, 0 zombie
%Cpu(s): 3.1 us, 0.0 sy, 0.0 ni, 96.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 26338697+total, 3445884 free, 37108092 used, 22283299+buff/cache
KiB Swap: 8388604 total, 8312060 free, 76544 used. 22385448+avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
24185 psoft 20 0 2200.6g 155.1g 121.7g S 100.0 61.7 3314:20 java
30155 psoft 20 0 174556 4984 3988 R 0.3 0.0 0:00.04 top
I'm wondering what is causing the indexing rate to slow down, and what can be done to improve it.