Elasticsearch Split API - excessive use of disk space

Hello,

I've installed Elasticsearch 7.11.2 on a 3-node cluster running Amazon Linux 2 on AWS EC2.
I need to split a big index (100 GB on disk, 1 primary shard and 1 replica) into 6 primary shards and 6 replicas using the Split index API (Split index API | Elasticsearch Reference [7.12] | Elastic).
What I did was run the following curl commands:

curl -X PUT "10.10.201.235:9200/index-old/_settings" -H 'Content-Type: application/json' -d'
{
  "settings": {
    "index.blocks.write": true 
  }
}'
curl -X POST "10.10.201.235:9200/index-old/_split/index-new" -H 'Content-Type: application/json' -d'
{
  "settings": {
    "index.number_of_shards": 6
  }
}'
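Before comparing disk sizes, it can help to confirm the split target has fully recovered. A minimal sketch (assuming the same host and index name as above) that waits for the new index to reach green status:

```shell
# Block until index-new is green (all primaries and replicas allocated),
# or give up after 10 minutes. Adjust host/port for your cluster.
curl -X GET "10.10.201.235:9200/_cluster/health/index-new?wait_for_status=green&timeout=10m&pretty"
```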

Apparently, the split went well, and I can see that 12 shards now exist on the cluster.

[root@ip-10-10-201-235 ec2-user]# curl 10.10.201.235:9200/_cat/recovery?pretty
.apm-custom-link                0 113ms empty_store    done n/a           n/a    10.10.202.79  node-c n/a n/a 0 0 0.0%   0   0   0   0.0%   0           0  0  100.0%
.apm-custom-link                0 242ms peer           done 10.10.202.79  node-c 10.10.201.235 node-b n/a n/a 1 1 100.0% 1   208 208 100.0% 208         0  0  100.0%
.kibana-event-log-7.11.2-000001 0 70ms  empty_store    done n/a           n/a    10.10.202.79  node-c n/a n/a 0 0 0.0%   0   0   0   0.0%   0           0  0  100.0%
.kibana-event-log-7.11.2-000001 0 233ms peer           done 10.10.202.79  node-c 10.10.201.235 node-b n/a n/a 1 1 100.0% 1   208 208 100.0% 208         1  1  100.0%
.kibana_task_manager_1          0 227ms empty_store    done n/a           n/a    10.10.200.30  node-a n/a n/a 0 0 0.0%   0   0   0   0.0%   0           0  0  100.0%
.kibana_task_manager_1          0 226ms peer           done 10.10.200.30  node-a 10.10.201.235 node-b n/a n/a 1 1 100.0% 1   208 208 100.0% 208         32 32 100.0%
index-new  0 3.2s  peer           done 10.10.201.235 node-b 10.10.202.79  node-c n/a n/a 0 0 0.0%   0   0   0   0.0%   0           0  0  100.0%
index-new  0 6.6s  existing_store done n/a           n/a    10.10.201.235 node-b n/a n/a 0 0 100.0% 313 0   0   100.0% 98986880554 0  0  100.0%
index-new  1 5.8s  peer           done 10.10.200.30  node-a 10.10.202.79  node-c n/a n/a 0 0 0.0%   0   0   0   0.0%   0           0  0  100.0%
index-new  1 6.1s  existing_store done n/a           n/a    10.10.200.30  node-a n/a n/a 0 0 100.0% 313 0   0   100.0% 98986880554 0  0  100.0%
index-new  2 3.3s  peer           done 10.10.201.235 node-b 10.10.202.79  node-c n/a n/a 0 0 0.0%   0   0   0   0.0%   0           0  0  100.0%
index-new  2 6.9s  existing_store done n/a           n/a    10.10.201.235 node-b n/a n/a 0 0 100.0% 313 0   0   100.0% 98986880554 0  0  100.0%
index-new  3 5.8s  peer           done 10.10.200.30  node-a 10.10.202.79  node-c n/a n/a 0 0 0.0%   0   0   0   0.0%   0           0  0  100.0%
index-new  3 5.9s  existing_store done n/a           n/a    10.10.200.30  node-a n/a n/a 0 0 100.0% 313 0   0   100.0% 98986880554 0  0  100.0%
index-new  4 174ms peer           done 10.10.201.235 node-b 10.10.200.30  node-a n/a n/a 0 0 0.0%   0   0   0   0.0%   0           0  0  100.0%
index-new  4 7s    existing_store done n/a           n/a    10.10.201.235 node-b n/a n/a 0 0 100.0% 313 0   0   100.0% 98986880554 0  0  100.0%
index-new  5 160ms peer           done 10.10.201.235 node-b 10.10.200.30  node-a n/a n/a 0 0 0.0%   0   0   0   0.0%   0           0  0  100.0%
index-new  5 580ms existing_store done n/a           n/a    10.10.201.235 node-b n/a n/a 0 0 100.0% 58  0   0   100.0% 10298242743 0  0  100.0%
.apm-agent-configuration        0 66ms  empty_store    done n/a           n/a    10.10.202.79  node-c n/a n/a 0 0 0.0%   0   0   0   0.0%   0           0  0  100.0%
.apm-agent-configuration        0 241ms peer           done 10.10.202.79  node-c 10.10.201.235 node-b n/a n/a 1 1 100.0% 1   208 208 100.0% 208         0  0  100.0%
index-old      0 4.3s  peer           done 10.10.201.235 node-b 10.10.200.30  node-a n/a n/a 0 0 0.0%   0   0   0   0.0%   0           0  0  100.0%
index-old      0 5.8s  existing_store done n/a           n/a    10.10.201.235 node-b n/a n/a 0 0 100.0% 256 0   0   100.0% 98952461671 0  0  100.0%
.kibana_2                       0 78ms  empty_store    done n/a           n/a    10.10.200.30  node-a n/a n/a 0 0 0.0%   0   0   0   0.0%   0           0  0  100.0%
.kibana_2                       0 285ms peer           done 10.10.200.30  node-a 10.10.201.235 node-b n/a n/a 1 1 100.0% 1   208 208 100.0% 208         55 55 100.0%
.kibana_1                       0 575ms peer           done 10.10.200.30  node-a 10.10.202.79  node-c n/a n/a 1 1 100.0% 1   208 208 100.0% 208         50 50 100.0%
.kibana_1                       0 61ms  empty_store    done n/a           n/a    10.10.200.30  node-a n/a n/a 0 0 0.0%   0   0   0   0.0%   0           0  0  100.0%
.tasks                          0 607ms peer           done 10.10.200.30  node-a 10.10.202.79  node-c n/a n/a 1 1 100.0% 1   208 208 100.0% 208         1  1  100.0%
.tasks                          0 60ms  empty_store    done n/a           n/a    10.10.200.30  node-a n/a n/a 0 0 0.0%   0   0   0   0.0%   0           0  0  100.0%

However, "index-new" with its 12 shards takes about 5 times more disk space than "index-old". This looks very odd to me, and it seems as though the split didn't delete the reallocated documents as described in the reference:
"Hashes all documents again, after low level files are created, to delete documents that belong to a different shard."

[root@ip-10-10-201-235 ec2-user]# curl 10.10.201.235:9200/_cat/indices
green open .apm-custom-link                ObK7Ev8MTTyxpdS5q_AWSg 1 1         0          0    416b    208b
green open .kibana-event-log-7.11.2-000001 ZgA5To0AR-GTlTP8zROkZg 1 1         1          0  11.3kb   5.6kb
green open .kibana_task_manager_1          mKLn1ADIScK9JtWEbKX5sQ 1 1         8         93 217.1kb 136.8kb
green open index-new  9gy-6SZEQCOU6njYv7nzHg 6 1 208839481 1202563158 858.4gb 470.5gb
green open .apm-agent-configuration        e9HPEFXZQ9aTcxsSH8r1rw 1 1         0          0    416b    208b
green open index-old      iLdQR2qyQXGaqAh6Ze6fHA 1 1 208839481   66478753 184.3gb  92.1gb
green open .kibana_2                       mFDNtAIxSbu-5VxT0M2djA 1 1        48         13     4mb     2mb
green open .kibana_1                       09JAJmSVTnab6i_Jkwuwbw 1 1        50          0  57.7kb  28.8kb
green open .tasks                          3Jvfo_qITna3zQEXFNxWKA 1 1         1          0  13.9kb   6.9kb

I wonder if anyone has experienced a similar issue, or did I miss anything important?

Many thanks,
Landong Zuo

Run a forcemerge on the index.


I'd recommend running:

POST /index-new/_forcemerge?max_num_segments=1

# We have to wait until it's done
GET /_cat/tasks?v

# Check the number of segments
GET /_cat/segments/index-new?v&h=index,shard,prirep,segment,docs.count,size
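Once the force merge completes, the write block that was set before the split can be cleared so the new index accepts writes again. A minimal sketch in the same curl style as the question (host and index name assumed from this thread):

```shell
# Clear the write block on the split target; setting it to null
# restores the default (writes allowed).
curl -X PUT "10.10.201.235:9200/index-new/_settings" -H 'Content-Type: application/json' -d'
{
  "index.blocks.write": null
}'
```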

Thank you both, the force merge worked well.
Landong