Split API: shard sizing issue post split process


Elasticsearch version: 7.3.1

I had a 107GB index with a single shard (primary shards: 1, replicas: 0). Because it was too large and was skewing the disk-based load distribution, I decided to use the Split API to split it into 10 shards. I used the following request:

POST /my_source_index/_split/my_target_index
{
  "settings": {
    "index.number_of_shards": 10
  }
}

The split itself was fast, and initially all shards were reported as 107GB (in /_cat/shards?v). Afterwards, as expected, most of them dropped to around 10GB, except for two shards:

my_logs-000009-split                1     p      STARTED  106914 107.9gb xx.xx.xx.73 node03
my_logs-000009-split                5     p      STARTED  107611  10.8gb xx.xx.xx.73 node03
my_logs-000009-split                9     p      STARTED  107303  10.8gb xx.xx.xx.71 node01
my_logs-000009-split                7     p      STARTED  107258  10.7gb xx.xx.xx.68 node05
my_logs-000009-split                3     p      STARTED  106939  10.5gb xx.xx.xx.68 node05
my_logs-000009-split                8     p      STARTED  107859    11gb xx.xx.xx.71 node01
my_logs-000009-split                4     p      STARTED  107100  10.8gb xx.xx.xx.62 node04
my_logs-000009-split                2     p      STARTED  107056  11.1gb xx.xx.xx.72 node02
my_logs-000009-split                6     p      STARTED  107179  10.9gb xx.xx.xx.72 node02
my_logs-000009-split                0     p      STARTED  106868 107.9gb xx.xx.xx.62 node04

As you can see, shards 0 and 1 still show 107GB even after 24 hours. Why does this happen?
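
To make the outliers explicit, here is a minimal sketch in Python that parses the rows pasted above and flags any shard whose store size is more than twice the median. The column order (index, shard, prirep, state, docs, store, ip, node) is taken from the listing; the function names and the 2x threshold are only illustrative assumptions.

```python
import statistics

# Rows copied from the /_cat/shards listing above.
# Assumed columns: index, shard, prirep, state, docs, store, ip, node
CAT_SHARDS = """\
my_logs-000009-split 1 p STARTED 106914 107.9gb xx.xx.xx.73 node03
my_logs-000009-split 5 p STARTED 107611 10.8gb xx.xx.xx.73 node03
my_logs-000009-split 9 p STARTED 107303 10.8gb xx.xx.xx.71 node01
my_logs-000009-split 7 p STARTED 107258 10.7gb xx.xx.xx.68 node05
my_logs-000009-split 3 p STARTED 106939 10.5gb xx.xx.xx.68 node05
my_logs-000009-split 8 p STARTED 107859 11gb xx.xx.xx.71 node01
my_logs-000009-split 4 p STARTED 107100 10.8gb xx.xx.xx.62 node04
my_logs-000009-split 2 p STARTED 107056 11.1gb xx.xx.xx.72 node02
my_logs-000009-split 6 p STARTED 107179 10.9gb xx.xx.xx.72 node02
my_logs-000009-split 0 p STARTED 106868 107.9gb xx.xx.xx.62 node04
"""

# Multipliers to convert each size suffix to GB; plain "b" must be checked last.
UNITS = {"kb": 1 / 1024**2, "mb": 1 / 1024, "gb": 1.0, "tb": 1024.0, "b": 1 / 1024**3}


def size_in_gb(text):
    """Convert a _cat size string such as '10.8gb' to a float number of GB."""
    text = text.lower()
    for suffix, factor in UNITS.items():
        if text.endswith(suffix):
            return float(text[: -len(suffix)]) * factor
    raise ValueError(f"unrecognised size: {text!r}")


def oversized_shards(cat_output, factor=2.0):
    """Return shard numbers whose store size exceeds factor * median size."""
    rows = [line.split() for line in cat_output.splitlines() if line.strip()]
    sizes = {int(row[1]): size_in_gb(row[5]) for row in rows}
    median = statistics.median(sizes.values())
    return sorted(shard for shard, gb in sizes.items() if gb > factor * median)


print(oversized_shards(CAT_SHARDS))  # prints [0, 1]
```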

Also, I see that the original index, my_logs-000009, is still around. Doesn't the Split API delete it after splitting?


Can you run a force merge on the index and see if that helps?
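
For reference, a force merge goes through the index's `_forcemerge` endpoint. Below is a rough sketch in Python that only builds the request. It assumes the cluster is reachable at http://localhost:9200 without authentication, which is a placeholder to replace. `only_expunge_deletes` is a real parameter of that API, but choosing it here is an assumption, not something specified above.

```python
import urllib.request

# Assumption: placeholder address; replace with your own cluster URL.
ES_URL = "http://localhost:9200"


def force_merge_request(index, only_expunge_deletes=True):
    """Build (but do not send) a POST request to the index's _forcemerge endpoint."""
    flag = "true" if only_expunge_deletes else "false"
    url = f"{ES_URL}/{index}/_forcemerge?only_expunge_deletes={flag}"
    return urllib.request.Request(url, method="POST")


req = force_merge_request("my_logs-000009-split")
print(req.get_method(), req.full_url)

# To actually send it to a reachable cluster:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())
```

Afterwards, /_cat/shards?v should show whether the store sizes changed.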

It does not.
