Hi,
I'm trying to use Elasticsearch to index about 500M documents. I started
by indexing documents with 15 fields, where a single doc is about
1.2KB.
While indexing this data I ran into an efficiency problem once the used
heap grew to its maximum size (set to 4GB). After that point the
indexing process slows down dramatically, as the timings below show:
Sat Jul 02 01:48:01 CEST 2011 Number of records indexed with Elastic
Search: 101890000 (3.775s), from start: (52625.54s)
Sat Jul 02 01:48:06 CEST 2011 Number of records indexed with Elastic
Search: 101900000 (4.702s), from start: (52630.243s)
Sat Jul 02 01:48:11 CEST 2011 Number of records indexed with Elastic
Search: 101910000 (5.05s), from start: (52635.294s)
...
Sat Jul 02 02:37:44 CEST 2011 Number of records indexed with Elastic
Search: 104280000 (56.441s), from start: (55608.197s)
Sat Jul 02 02:38:27 CEST 2011 Number of records indexed with Elastic
Search: 104290000 (43.182s), from start: (55651.379s)
Sat Jul 02 02:39:25 CEST 2011 Number of records indexed with Elastic
Search: 104300000 (57.847s), from start: (55709.226s)
...
Mon Jul 04 00:52:11 CEST 2011 Number of records indexed with Elastic
Search: 108220000 (11860.994s), from start: (222075.855s)
Mon Jul 04 03:59:51 CEST 2011 Number of records indexed with Elastic
Search: 108230000 (11259.611s), from start: (233335.466s)
Mon Jul 04 06:59:22 CEST 2011 Number of records indexed with Elastic
Search: 108240000 (10770.902s), from start: (244106.368s)
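To put the timings above in perspective, the implied throughput can be computed directly from the logged batch times (a quick sketch; the figures are copied from the log lines above):

```python
# Each log line reports the seconds spent indexing the latest 10k batch.
batch = 10_000

# Shortly before the heap filled up (batch at 101,890,000 docs):
early = batch / 3.775        # ~2650 docs/s

# Two days later, after prolonged GC thrashing (batch at 108,220,000 docs):
late = batch / 11860.994     # under 1 doc/s

print(f"early: {early:.0f} docs/s, late: {late:.2f} docs/s")
print(f"slowdown factor: ~{early / late:.0f}x")  # over 3000x slower
```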
As you can see, the time needed to index 10k docs keeps growing.
In the ES server logs I can see that used memory hit the maximum heap
size at around Sat Jul 02 02:30 (listed below).
[2011-07-02 02:28:06,776][DEBUG][monitor.jvm ] [Vibraxas]
[gc][ParNew][158981] took [41ms]/[46.8m], reclaimed [61.9mb], leaving
[3.9gb] used, max [4.1gb]
[2011-07-02 02:28:07,819][DEBUG][monitor.jvm ] [Vibraxas]
[gc][ParNew][158982] took [55ms]/[46.8m], reclaimed [65.8mb], leaving
[3.9gb] used, max [4.1gb]
[2011-07-02 02:28:20,238][DEBUG][monitor.jvm ] [Vibraxas]
[gc][ParNew][158983] took [0s]/[46.8m], reclaimed [0b], leaving [4gb]
used, max [4.1gb]
[2011-07-02 02:29:32,037][INFO ][monitor.jvm ] [Vibraxas]
[gc][ConcurrentMarkSweep] collection occurred, took [5.1s]
[2011-07-02 02:30:01,741][DEBUG][monitor.jvm ] [Vibraxas]
[gc][ParNew][158984] took [68ms]/[46.8m], reclaimed [60.9mb], leaving
[3.9gb] used, max [4.1gb]
[2011-07-02 02:30:14,253][DEBUG][monitor.jvm ] [Vibraxas]
[gc][ParNew][158986] took [0s]/[46.8m], reclaimed [0b], leaving [4gb]
used, max [4.1gb]
[2011-07-02 02:30:30,155][DEBUG][monitor.jvm ] [Vibraxas]
[gc][ParNew][158987] took [48ms]/[46.8m], reclaimed [63.5mb], leaving
[3.9gb] used, max [4.1gb]
[2011-07-02 02:30:31,194][DEBUG][monitor.jvm ] [Vibraxas]
[gc][ParNew][158988] took [52ms]/[46.8m], reclaimed [65.9mb], leaving
[3.9gb] used, max [4.1gb]
[2011-07-02 02:30:42,809][DEBUG][monitor.jvm ] [Vibraxas]
[gc][ParNew][158989] took [0s]/[46.8m], reclaimed [0b], leaving [4gb]
used, max [4.1gb]
[2011-07-02 02:30:56,780][DEBUG][monitor.jvm ] [Vibraxas]
[gc][ParNew][158990] took [0s]/[46.8m], reclaimed [0b], leaving [4gb]
used, max [4.1gb]
[2011-07-02 02:31:05,966][DEBUG][monitor.jvm ] [Vibraxas]
[gc][ParNew][158991] took [0s]/[46.8m], reclaimed [0b], leaving [4gb]
used, max [4.1gb]
[2011-07-02 02:31:06,516][DEBUG][index.merge.scheduler ] [Vibraxas]
[archive][1] merge [_1jpv] done, took [36s]
[2011-07-02 02:31:57,108][DEBUG][monitor.jvm ] [Vibraxas]
[gc][ParNew][158992] took [0s]/[46.8m], reclaimed [0b], leaving [4gb]
used, max [4.1gb]
[2011-07-02 02:33:07,401][DEBUG][monitor.jvm ] [Vibraxas]
[gc][ParNew][158993] took [44ms]/[46.8m], reclaimed [60.6mb], leaving
[3.9gb] used, max [4.1gb]
[2011-07-02 02:33:08,456][DEBUG][monitor.jvm ] [Vibraxas]
[gc][ParNew][158994] took [56ms]/[46.8m], reclaimed [64.7mb], leaving
[3.9gb] used, max [4.1gb]
[2011-07-02 02:33:22,535][DEBUG][monitor.jvm ] [Vibraxas]
[gc][ParNew][158995] took [0s]/[46.8m], reclaimed [0b], leaving [4gb]
used, max [4.1gb]
[2011-07-02 02:39:18,198][INFO ][monitor.jvm ] [Vibraxas]
[gc][ConcurrentMarkSweep] collection occurred, took [5s]
[2011-07-02 02:39:32,931][INFO ][monitor.jvm ] [Vibraxas]
[gc][ConcurrentMarkSweep] collection occurred, took [5s]
[2011-07-02 02:40:38,400][INFO ][monitor.jvm ] [Vibraxas]
[gc][ConcurrentMarkSweep] collection occurred, took [5s]
[2011-07-02 02:40:38,859][DEBUG][index.merge.scheduler ] [Vibraxas]
[archive][0] merge [_1jt1] done, took [51s]
[2011-07-02 02:42:12,188][INFO ][monitor.jvm ] [Vibraxas]
[gc][ConcurrentMarkSweep] collection occurred, took [5.1s]
[2011-07-02 02:42:26,892][INFO ][monitor.jvm ] [Vibraxas]
[gc][ConcurrentMarkSweep] collection occurred, took [5s]
[2011-07-02 02:43:08,566][INFO ][monitor.jvm ] [Vibraxas]
[gc][ConcurrentMarkSweep] collection occurred, took [5s]
[2011-07-02 02:43:17,608][DEBUG][index.merge.scheduler ] [Vibraxas]
[archive][1] merge [_1js1] done, took [22.8s]
[2011-07-02 02:43:31,564][DEBUG][index.merge.scheduler ] [Vibraxas]
[archive][4] merge [_1jtl] done, took [50.8s]
[2011-07-02 02:44:14,349][INFO ][monitor.jvm ] [Vibraxas]
[gc][ConcurrentMarkSweep] collection occurred, took [5.1s]
[2011-07-02 02:45:38,831][DEBUG][index.merge.scheduler ] [Vibraxas]
[archive][2] merge [_1jsr] done, took [55.5s]
[2011-07-02 02:46:49,777][INFO ][monitor.jvm ] [Vibraxas]
[gc][ConcurrentMarkSweep] collection occurred, took [5.2s]
[2011-07-02 02:48:50,874][INFO ][monitor.jvm ] [Vibraxas]
[gc][ConcurrentMarkSweep] collection occurred, took [5s]
[2011-07-02 02:49:33,221][INFO ][monitor.jvm ] [Vibraxas]
[gc][ConcurrentMarkSweep] collection occurred, took [5.2s]
[2011-07-02 02:49:47,456][INFO ][monitor.jvm ] [Vibraxas]
[gc][ConcurrentMarkSweep] collection occurred, took [5s]
[2011-07-02 02:50:01,932][INFO ][monitor.jvm ] [Vibraxas]
[gc][ConcurrentMarkSweep] collection occurred, took [5.2s]
[2011-07-02 02:50:15,933][INFO ][monitor.jvm ] [Vibraxas]
[gc][ConcurrentMarkSweep] collection occurred, took [5.1s]
[2011-07-02 02:50:25,408][DEBUG][index.merge.scheduler ] [Vibraxas]
[archive][3] merge [_1jtg] done, took [1.3m]
[2011-07-02 02:50:53,094][INFO ][monitor.jvm ] [Vibraxas]
[gc][ConcurrentMarkSweep] collection occurred, took [5.1s]
[2011-07-02 02:51:02,304][DEBUG][index.merge.scheduler ] [Vibraxas]
[archive][1] merge [_1jsz] done, took [1.2m]
[2011-07-02 02:52:12,575][INFO ][monitor.jvm ] [Vibraxas]
[gc][ConcurrentMarkSweep] collection occurred, took [5.1s]
[2011-07-02 02:53:32,671][INFO ][monitor.jvm ] [Vibraxas]
[gc][ConcurrentMarkSweep] collection occurred, took [5.2s]
[2011-07-02 02:54:58,395][INFO ][monitor.jvm ] [Vibraxas]
[gc][ConcurrentMarkSweep] collection occurred, took [5s]
When I increase the max heap size, the slowdown still appears anyway,
just later.
So my question is: is an ES node limited to some number of indexed
docs, or is this just a matter of a bad configuration on my server?
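In case it matters, the heap is set via the standard JVM flags. A sketch of the startup (the `ES_MIN_MEM`/`ES_MAX_MEM` variable names are my assumption about the wrapper script and may differ between ES versions; the underlying `-Xms`/`-Xmx` flags are standard):

```shell
# Assumed startup-script variables; they translate to JVM heap flags.
export ES_MIN_MEM=4g   # initial heap (-Xms4g)
export ES_MAX_MEM=4g   # maximum heap (-Xmx4g)
./bin/elasticsearch
```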
I hope the profiler stats for the server are also helpful:
Memory allocations (sorted descending by shallow size - CSV format)
"Name","Number of Objects","Shallow Size","Objects (+/-)","Size (+/-)"
"long[]","2054","1279071528","",""
"char[]","17031494","759227928","",""
"org.apache.lucene.index.TermInfo","16989024","679560960","",""
"java.lang.String","17024187","544773984","",""
"int[]","8363","429248832","",""
"org.apache.lucene.index.Term","16991252","407790048","",""
"org.apache.lucene.index.Term[]","230","67860176","",""
"org.apache.lucene.index.TermInfo[]","229","67860160","",""
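A quick back-of-the-envelope calculation over these rows shows that the Lucene term dictionary dominates the heap (the object counts and byte sizes below are copied straight from the profiler CSV above):

```python
# (object count, shallow bytes) pairs from the profiler output.
allocs = {
    "long[]":   (2054,     1279071528),
    "char[]":   (17031494, 759227928),
    "TermInfo": (16989024, 679560960),
    "String":   (17024187, 544773984),
    "Term":     (16991252, 407790048),
}

# The listed rows alone account for ~3.4 GiB of the 4 GB heap.
total = sum(size for _, size in allocs.values())
print(f"total of listed rows: {total / 2**30:.2f} GiB")

# Shallow size per Lucene term-dictionary entry: 40 bytes each,
# times ~17M in-memory terms.
count, size = allocs["TermInfo"]
print(f"bytes per TermInfo: {size / count:.0f}")
```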
CPU:
"Name","Time (ms)","Samples"
"","32742","1359"
"org.apache.lucene.index.ConcurrentMergeScheduler
$MergeThread.run()","28077","195" (~86%)
"org.elasticsearch.index.store.support.AbstractStore
$StoreIndexOutput.writeByte(byte)","4268","26"
"org.elasticsearch.index.store.support.AbstractStore
$StoreIndexOutput.writeByte(byte)","1496","10"
"org.elasticsearch.index.store.support.AbstractStore
$StoreIndexOutput.writeByte(byte)","1248","6"
"org.elasticsearch.index.store.support.AbstractStore
$StoreIndexOutput.writeByte(byte)","296","2"
"org.elasticsearch.index.store.support.AbstractStore
$StoreIndexOutput.writeBytes(byte[], int, int)","140","1"
"java.lang.Thread.run()","4616","1050" (~14%)
"org.elasticsearch.common.netty.util.internal.DeadLockProofWorker
$1.run()","1048","684"
"org.elasticsearch.index.shard.service.InternalIndexShard
$EngineRefresher$1.run()","736","17"
"org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction
$AsyncShardOperationAction$1.run()","420","19"
"org.elasticsearch.index.translog.TranslogService$TranslogBasedFlush
$1.run()","260","4"
"org.elasticsearch.threadpool.ThreadPool
$LoggingRunnable.run()","240","17"
"org.elasticsearch.index.shard.service.InternalIndexShard$EngineMerger
$1.run()","72","16"
"org.elasticsearch.index.shard.service.InternalIndexShard
$EngineRefresher.run()","52","3"
"org.elasticsearch.index.shard.service.InternalIndexShard
$EngineMerger.run()","40","1"
"org.elasticsearch.discovery.zen.ping.multicast.MulticastZenPing
$Receiver.run()","16","38"
"org.elasticsearch.threadpool.ThreadPool
$EstimatedTimeThread.run()","28","38"
"java.lang.ref.Reference$ReferenceHandler.run()","12","38"
"java.lang.ref.Finalizer$FinalizerThread.run()","8","38"
Thanks for your help.