I have followed the advice in the aKNN tuning guide, but no matter the settings, the indexing process still creates a huge tail of tiny segments.
Setup:
- New dev deployment
- Zero search traffic
- 64GB RAM, CPU optimized
"indices.memory.index_buffer_size": "10%"
"index.translog.flush_threshold_size": "10gb"
"index.refresh_interval": "-1"
On a 64GB node, I understand half is allocated to the JVM heap, so we have 32GB of heap. If index_buffer_size
is 10% of that heap, then that should be about 3.2GB of indexing buffer.
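The arithmetic above can be sketched as follows. One thing worth keeping in mind is that the indexing buffer is shared by all shards that are actively indexing on the node, so the amount available to a single shard is smaller than the node-wide figure. The shard count used here (`ACTIVE_SHARDS = 8`) is a made-up value for illustration, not something taken from this deployment, and the even split is a simplification.

```python
def index_buffer_gb(heap_gb, buffer_percent):
    """Node-wide indexing buffer in GB, given heap size and index_buffer_size percent."""
    return heap_gb * buffer_percent / 100


def per_shard_share_gb(heap_gb, buffer_percent, active_shards):
    """Rough per-shard share, assuming the buffer is split evenly (a simplification)."""
    if active_shards < 1:
        raise ValueError("active_shards must be at least 1")
    return index_buffer_gb(heap_gb, buffer_percent) / active_shards


HEAP_GB = 32          # half of the 64GB node, as assumed in the post
BUFFER_PERCENT = 10   # indices.memory.index_buffer_size
ACTIVE_SHARDS = 8     # hypothetical value, not from the post

total = index_buffer_gb(HEAP_GB, BUFFER_PERCENT)
share = per_shard_share_gb(HEAP_GB, BUFFER_PERCENT, ACTIVE_SHARDS)
print(f"node-wide buffer: {total:.2f} GB")
print(f"per-shard share with {ACTIVE_SHARDS} active shards: {share:.2f} GB")
```

If the real number of actively indexing shards on the node is known, substituting it for `ACTIVE_SHARDS` gives a more realistic upper bound on how large a freshly flushed segment could be.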
With these settings, how in the world do I end up with dozens of segments that are <100MB?
Here's an example of one shard's segments:
s segment size docs.count
0 _co2 510.6kb 31
0 _cou 716.3kb 44
0 _coq 1.7mb 111
0 _cot 3.1mb 199
0 _cop 5.5mb 354
0 _cnx 5.8mb 376
0 _co7 7.3mb 471
0 _co6 8.4mb 542
0 _cor 11.7mb 1986
0 _co1 12.2mb 1972
0 _cos 13.7mb 881
0 _cog 14.1mb 907
0 _cnv 15.3mb 981
0 _coo 18.3mb 1175
0 _cow 20.2mb 1296
0 _com 31.6mb 2030
0 _cov 40.1mb 2577
0 _cod 81.5mb 5234
0 _coe 84.8mb 5453
0 _col 87.7mb 5641
0 _cmv 106.6mb 6852
0 _cok 109.5mb 7040
0 _cja 131mb 8419
0 _coj 150.7mb 30500
0 _cob 335.3mb 54380
0 _cjr 584mb 48689
0 _cmn 707.9mb 59709
0 _cgn 732.6mb 198037
0 _ckb 868.9mb 241883
0 _c8x 962.4mb 90502
0 _c60 1021.7mb 97412
0 _cn6 1.3gb 377820
0 _2ut 1.4gb 393405
0 _a3v 1.5gb 438436
0 _bcu 1.6gb 469321
0 _7e4 1.6gb 483486
0 _2oa 1.7gb 492613
0 _8rc 1.7gb 500500
0 _32p 1.8gb 499522
0 _a7z 1.9gb 546017
0 _bzy 1.9gb 556732
0 _73j 2gb 580960
0 _4o6 2gb 592797
0 _6k3 2.3gb 660575
0 _9t0 2.4gb 676770
0 _5lu 2.5gb 723636
0 _7oi 2.6gb 752804
0 _57c 2.6gb 756930
0 _aq8 2.6gb 762371
0 _96j 2.7gb 776362
0 _3k1 2.8gb 795040
0 _ccf 2.8gb 816245
0 _wp 3.1gb 889646
0 _axy 3.1gb 892622
0 _bng 3.3gb 921377
0 _xc 3.4gb 972174
0 _9ai 3.8gb 1103511
0 _628 4gb 1142051
0 _310 4.2gb 1199113
0 _agl 4.5gb 1284416
0 _9td 4.5gb 1299733
0 _67z 4.6gb 1317589
0 _86m 4.6gb 1316734
0 _4xf 4.7gb 1353413
0 _c12 4.7gb 1351662
0 _75i 4.8gb 1362717
0 _bcn 4.8gb 1372132
0 _428 4.9gb 1394784
0 _83p 4.9gb 1404350
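To quantify the tail rather than eyeball it, a listing like the one above can be summarized with a small script. This is a minimal sketch that assumes the four-column layout shown here (shard, segment, size, docs.count) and binary units (1kb = 1024 bytes); the helper names are hypothetical, and `SAMPLE` contains only a few rows copied from the listing.

```python
import re

UNITS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4}


def parse_size(text):
    """Convert a human-readable size such as '510.6kb' or '2gb' to bytes."""
    match = re.fullmatch(r"([0-9.]+)([a-z]+)", text.strip().lower())
    if match is None or match.group(2) not in UNITS:
        raise ValueError(f"unrecognized size: {text!r}")
    return float(match.group(1)) * UNITS[match.group(2)]


def parse_segments(listing):
    """Parse 'shard segment size docs.count' rows, skipping the header line."""
    rows = []
    for line in listing.strip().splitlines():
        parts = line.split()
        if len(parts) != 4 or not parts[0].isdigit() or not parts[3].isdigit():
            continue  # header or malformed row
        shard, name, size, docs = parts
        rows.append((int(shard), name, parse_size(size), int(docs)))
    return rows


def summarize(rows, threshold_bytes=100 * 1024**2):
    """Count segments below the threshold and total their size and documents."""
    small = [row for row in rows if row[2] < threshold_bytes]
    return {
        "segments": len(rows),
        "small_segments": len(small),
        "small_bytes": sum(row[2] for row in small),
        "small_docs": sum(row[3] for row in small),
    }


SAMPLE = """\
s segment size docs.count
0 _co2 510.6kb 31
0 _cou 716.3kb 44
0 _col 87.7mb 5641
0 _cmv 106.6mb 6852
0 _83p 4.9gb 1404350
"""

summary = summarize(parse_segments(SAMPLE))
print(summary)
```

Pasting the full listing into `SAMPLE` shows how many segments fall under the 100MB threshold and how little data they hold in total.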