Currently the configuration uses daily indices on a single-node cluster, so we decided to move to weekly indices to reduce the number of shards in the cluster.
Most clients want a 6-month history, which means we always have more than 1000 shards, and in some cases 4k or 5k. As the listings below show, many of the indices are also much smaller than Elasticsearch advises.
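To make the shard arithmetic concrete, here is a rough sketch for a single index family. The `shard_count` helper and the 180-day retention figure are illustrative assumptions, not part of the actual setup; the one primary shard and zero replicas match the index listings in this post.

```shell
#!/bin/sh
# Rough shard count for one family of time-based indices.
# Hypothetical helper: the 180-day retention is an assumed stand-in
# for "6 months"; 1 primary and 0 replicas match the listings here.
shard_count() {
    retention_days=$1
    days_per_index=$2
    primaries=$3
    replicas=$4
    # Round up so a partially filled period still counts as one index.
    indices=$(( (retention_days + days_per_index - 1) / days_per_index ))
    echo $(( indices * primaries * (1 + replicas) ))
}

shard_count 180 1 1 0   # daily indices
shard_count 180 7 1 0   # weekly indices
```

Multiplying the result by the number of index families on the node gives an estimate of the cluster-wide total.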
The issue:
After reindexing the daily indices into weekly ones, searches with a sort became considerably slower (about 1.3 s versus 0.9 s in the test below). This goes against everything I've been reading: I expected fewer indices to make searches more efficient, since fewer shards would be queried.
Test:
Daily indices:
green open index-2023-01-01 yBiABwMPShuiQ03LeQ7DEg 1 0 132304 0 30.6mb 30.6mb
green open index-2023-01-02 c_kB9wioRP29fuM-_85a9g 1 0 175048 0 40.5mb 40.5mb
green open index-2023-01-03 -b5KDGthSuqnPBPH5tOnLA 1 0 184778 0 41.9mb 41.9mb
green open index-2023-01-04 MmMnmv1_QSu5R8Uha7TjSg 1 0 86324 0 18.3mb 18.3mb
.....
green open index-2023-01-31 mxWh8y4XS-amRQ-uHpDyNQ 1 0 240864 0 59.9mb 59.9mb
time curl -X GET "10.10.10.10:9200/index-2023-01-*/_search" -H 'Content-Type: application/json' -d '{"size": 10000, "track_total_hits": false, "query": {"bool": { "must": [ { "range": { "index.date": { "from": 1674208736000, "to": null, "include_lower": true, "include_upper": true, "boost": 1.0 } } } ], "adjust_pure_negative": true, "boost": 1.0 } }, "sort": [ { "index.date": { "order": "desc" } } ] }' | wc -l
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 13.5M 100 13.5M 100 308 14.9M 340 --:--:-- --:--:-- --:--:-- 14.9M
0
real 0m0.917s
user 0m0.012s
sys 0m0.022s
Weekly indices:
green open index-test-2022-12-25 AWbqQg_eR_iOxtRQXXoRlw 1 0 770 0 144.9kb 144.9kb
green open index-test-2023-01-01 gKKuyq0eR1OfMAsLERGe7A 1 0 924199 0 206mb 206mb
green open index-test-2023-01-08 CEx-SpAlRQ6Lc84Fsqz07A 1 0 804620 0 167.9mb 167.9mb
green open index-test-2023-01-15 7Y3ctKjvTCaPYSnozPsImg 1 0 1137348 0 240.9mb 240.9mb
green open index-test-2023-01-22 GYLau7xsS3eCaVFv6vdPEw 1 0 1214504 0 274.8mb 274.8mb
green open index-test-2023-01-29 TxIrCV89QzKNv6nk_M9nBw 1 0 415912 0 99.4mb 99.4mb
time curl -X GET "10.10.10.10:9200/index-test-*/_search" -H 'Content-Type: application/json' -d '{"size": 10000, "track_total_hits": false, "query": {"bool": { "must": [ { "range": { "index.date": { "from": 1674208736000, "to": null, "include_lower": true, "include_upper": true, "boost": 1.0 } } } ], "adjust_pure_negative": true, "boost": 1.0 } }, "sort": [ { "index.date": { "order": "desc" } } ] }' | wc -l
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 13.6M 100 13.6M 100 308 10.2M 232 0:00:01 0:00:01 --:--:-- 10.2M
0
real 0m1.339s
user 0m0.006s
sys 0m0.038s
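One caveat about the timings above: `time curl` measures wall-clock time, which includes transferring roughly 13.5 MB of hits. The search response also carries a `took` field with the server-side time in milliseconds, which may be a fairer basis for comparison. A minimal sketch for pulling it out follows; `extract_took` is a hypothetical helper and assumes the default compact JSON response (no `pretty` parameter), which begins with `{"took":`.

```shell
#!/bin/sh
# Hypothetical helper: print the server-side "took" value (milliseconds)
# from a search response read on stdin.
# Assumes the default compact response, which begins with {"took":N,...
extract_took() {
    sed -n 's/^{"took":\([0-9][0-9]*\).*/\1/p'
}

# Intended usage (query.json is an assumed file holding the request body
# used in the tests above):
#   curl -s -X GET "10.10.10.10:9200/index-test-*/_search" \
#        -H 'Content-Type: application/json' -d @query.json | extract_took
```

Running each variant several times and comparing the `took` values should also reduce the influence of caching and network noise on a single measurement.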
Does anyone have a tip? Could it be that I would only see an improvement under heavy query load?