Hello,
I was under the assumption that the disk space used by a shard is the sum of the disk space of its segments.
I have an index with 1 million documents and 3 shards, and it seems to me that this is not true. Can someone tell me whether this is expected or whether it should be investigated further? I could not find any pointer in the documentation.
This is the output of a call to
_cat/shards/xxx?v&h=shard,state,index,id,prirep,docs,store,merges.*
shard state index id prirep docs store merges.current merges.current_docs merges.current_size merges.total merges.total_docs merges.total_size merges.total_time
0 STARTED xxx.2023-12-19.11.53 haMvMi8fToesbgjwHYFc3A r 339904 3.7gb 0 0 0b 905 2762242 29.1gb 1.1h
0 STARTED xxx.2023-12-19.11.53 Yqoddn-tTfK5ggHi5_eYAw p 339904 8.2gb 0 0 0b 921 2811980 30.1gb 1.1h
1 STARTED xxx.2023-12-19.11.53 w7DSKeH9SZa_2C72ulKGmw p 341076 15.9gb 0 0 0b 878 2722777 27.7gb 1.1h
1 STARTED xxx.2023-12-19.11.53 Yqoddn-tTfK5ggHi5_eYAw r 341076 5.1gb 0 0 0b 872 2358927 24.4gb 48m
2 STARTED xxx.2023-12-19.11.53 w7DSKeH9SZa_2C72ulKGmw r 342105 14.7gb 0 0 0b 910 2492439 25.9gb 59.7m
2 STARTED xxx.2023-12-19.11.53 haMvMi8fToesbgjwHYFc3A p 342105 3.7gb 0 0 0b 909 2753674 27.9gb 1h
As you can see, the store size of the primary of shard 1 is 15.9 GB.
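(For reference, I believe the same call with bytes=b would return exact byte counts instead of the rounded human-readable values, e.g.
_cat/shards/xxx?v&bytes=b&h=shard,prirep,store
but the gap here is far too large to be a rounding issue.)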
This is an extract of the output of a GET to
_cat/segments/superevadb?v&s=p,shard,size:desc
It seems to me that the cumulative size of all its segments is less than 4 GB (see the quick check sketched after the table).
index shard prirep ip segment generation docs.count docs.deleted size size.memory committed searchable version compound
xxx.2023-12-19.11.53 1 p 10.2.13.47 _5z5 7745 312399 29049 3gb 0 true true 9.8.0 false
xxx.2023-12-19.11.53 1 p 10.2.13.47 _6ry 8782 9225 2512 173.3mb 0 true true 9.8.0 true
xxx.2023-12-19.11.53 1 p 10.2.13.47 _6fa 8326 8876 2853 163.6mb 0 true true 9.8.0 true
xxx.2023-12-19.11.53 1 p 10.2.13.47 _6is 8452 5430 3629 140.2mb 0 true true 9.8.0 true
xxx.2023-12-19.11.53 1 p 10.2.13.47 _6st 8813 1073 309 20.3mb 0 true true 9.8.0 true
xxx.2023-12-19.11.53 1 p 10.2.13.47 _6tz 8855 1277 1 19.3mb 0 true true 9.8.0 true
xxx.2023-12-19.11.53 1 p 10.2.13.47 _6to 8844 683 4 14.7mb 0 true true 9.8.0 true
xxx.2023-12-19.11.53 1 p 10.2.13.47 _6t3 8823 701 104 13.2mb 0 true true 9.8.0 true
xxx.2023-12-19.11.53 1 p 10.2.13.47 _6t7 8827 223 121 9.1mb 0 true true 9.8.0 true
xxx.2023-12-19.11.53 1 p 10.2.13.47 _6sj 8803 759 9 8.3mb 0 true true 9.8.0 true
xxx.2023-12-19.11.53 1 p 10.2.13.47 _6tf 8835 186 0 5mb 0 true true 9.8.0 true
xxx.2023-12-19.11.53 1 p 10.2.13.47 _6tk 8840 226 0 4.7mb 0 true true 9.8.0 true
xxx.2023-12-19.11.53 1 p 10.2.13.47 _6ua 8866 9 1 252.6kb 0 false true 9.8.0 true
xxx.2023-12-19.11.53 1 p 10.2.13.47 _6u4 8860 2 0 97.3kb 0 false true 9.8.0 true
xxx.2023-12-19.11.53 1 p 10.2.13.47 _6u3 8859 1 0 78.9kb 0 false true 9.8.0 true
xxx.2023-12-19.11.53 1 p 10.2.13.47 _6ud 8869 1 0 73kb 0 false true 9.8.0 true
xxx.2023-12-19.11.53 1 p 10.2.13.47 _6uc 8868 1 0 72.4kb 0 false true 9.8.0 true
xxx.2023-12-19.11.53 1 p 10.2.13.47 _6u0 8856 1 0 70.2kb 0 true false 9.8.0 true
xxx.2023-12-19.11.53 1 p 10.2.13.47 _6u2 8858 1 0 68kb 0 true false 9.8.0 true
xxx.2023-12-19.11.53 1 p 10.2.13.47 _6u1 8857 1 0 65.2kb 0 true false 9.8.0 true
xxx.2023-12-19.11.53 1 p 10.2.13.47 _6ub 8867 1 0 62.4kb 0 false true 9.8.0 true
xxx.2023-12-19.11.53 1 p 10.2.13.47 _6tr 8847 1 0 59.1kb 0 true false 9.8.0 true
xxx.2023-12-19.11.53 1 p 10.2.13.47 _6ts 8848 1 0 58.9kb 0 true false 9.8.0 true
xxx.2023-12-19.11.53 1 p 10.2.13.47 _6tv 8851 1 0 48.4kb 0 true false 9.8.0 true
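For completeness, this is roughly the check I am relying on. It is only a sketch, assuming the _cat APIs are queried with format=json and bytes=b; ES_URL and INDEX are placeholders for my endpoint and index name.

import json
from collections import defaultdict
from urllib.request import urlopen

ES_URL = "http://localhost:9200"   # assumption: local cluster endpoint
INDEX = "xxx.2023-12-19.11.53"     # placeholder index name

def cat(path):
    """Call a _cat endpoint and parse its JSON output."""
    with urlopen(f"{ES_URL}{path}") as resp:
        return json.load(resp)

# Cumulative segment size per primary shard, in bytes.
seg_totals = defaultdict(int)
for seg in cat(f"/_cat/segments/{INDEX}?format=json&bytes=b&h=shard,prirep,size"):
    if seg["prirep"] == "p":
        seg_totals[seg["shard"]] += int(seg["size"])

# Store size reported by _cat/shards for each primary shard, in bytes.
for shard in cat(f"/_cat/shards/{INDEX}?format=json&bytes=b&h=shard,prirep,store"):
    if shard["prirep"] == "p":
        s = shard["shard"]
        print(f"shard {s}: store={shard['store']} segments_sum={seg_totals[s]}")

For shard 1 this is how I get to the discrepancy above: roughly 3.6 GB of segments against the 15.9 GB reported as the store size.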
For the other shards, instead, the store size and the cumulative segment size are similar. I am looking into this because, after upgrading from Elasticsearch 7 to 8.11.1, I am noticing an increase in disk usage. After a restart of the cluster nodes, disk usage suddenly drops and then increases gradually over the following hours, and I can't figure out what I am doing wrong.
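To quantify this over time, the kind of thing I have in mind is sampling the per-node disk figures periodically, roughly like this (again just a sketch; the endpoint and the one-hour interval are placeholders):

import time
from urllib.request import urlopen

ES_URL = "http://localhost:9200"   # assumption: cluster endpoint

# Columns from _cat/allocation that are relevant for the disk growth.
COLUMNS = "node,disk.indices,disk.used,disk.avail"

while True:
    with urlopen(f"{ES_URL}/_cat/allocation?v&bytes=b&h={COLUMNS}") as resp:
        print(time.strftime("%Y-%m-%d %H:%M:%S"))
        print(resp.read().decode("utf-8"))
    time.sleep(3600)   # one sample per hour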
Thank you,
Tommaso