Uneven shard sizes


I saw some previous posts about uneven shard distribution, but I was unable to find the information I need.
I have an index with 2 shards and 1 replica:

index-name 1 r STARTED 1510504 546.6mb node-name
index-name 1 r STARTED 1510504 458.4mb node-name
index-name 1 p STARTED 1510504 502.9mb node-name
index-name 0 p STARTED 374273705 51.4gb node-name
index-name 0 r STARTED 374273705 52.6gb node-name
index-name 0 r STARTED 374273705 51.4gb node-name
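
To put a number on the imbalance, here is a minimal sketch that parses the listing above and compares the doc counts of the two primaries. It assumes the seven columns are index, shard, prirep, state, docs, store, and node, as they appear in this output; the helper name `docs_per_primary` is made up for illustration.

```python
# Shard listing copied from the post above.
CAT_SHARDS = """\
index-name 1 r STARTED 1510504 546.6mb node-name
index-name 1 r STARTED 1510504 458.4mb node-name
index-name 1 p STARTED 1510504 502.9mb node-name
index-name 0 p STARTED 374273705 51.4gb node-name
index-name 0 r STARTED 374273705 52.6gb node-name
index-name 0 r STARTED 374273705 51.4gb node-name
"""


def docs_per_primary(cat_output):
    """Return {shard_number: doc_count} for primary shards only."""
    docs = {}
    for line in cat_output.splitlines():
        parts = line.split()
        if len(parts) < 7:
            continue  # skip blank or malformed lines
        _index, shard, prirep, _state, doc_count, _store, _node = parts[:7]
        if prirep == "p":
            docs[int(shard)] = int(doc_count)
    return docs


primary_docs = docs_per_primary(CAT_SHARDS)
ratio = max(primary_docs.values()) / min(primary_docs.values())
print(primary_docs, round(ratio, 1))
```

Both shards report the same doc count on their primary and replica copies, so the skew is in how documents are routed to shards, not in how the copies are placed on nodes.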

According to the cluster allocation explain API, there's no reason to rebalance, but as you can see, the difference in doc count and size is significant. I don't see anything in the decider output that would explain the result. Below is part of the explain output:

  • "rebalance_explanation" : "cannot rebalance as no target node exists that can both allocate this shard and improve the cluster balance",
  • "can_remain_on_current_node" : "yes",
  • "can_rebalance_cluster" : "yes",
  • "can_rebalance_to_other_node" : "no",
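
For anyone who wants to reproduce this output, the following is a rough sketch of how the request to the cluster allocation explain API could be put together. It only builds the request and does not send it; the helper name `build_allocation_explain_request` is hypothetical, and the index name and shard number are taken from the listing above.

```python
import json


def build_allocation_explain_request(index, shard, primary, include_yes_decisions=False):
    """Build method, path and JSON body for GET _cluster/allocation/explain."""
    path = "/_cluster/allocation/explain"
    if include_yes_decisions:
        # Also return the deciders that answered YES, not only the NO ones.
        path += "?include_yes_decisions=true"
    body = {"index": index, "shard": shard, "primary": primary}
    return "GET", path, json.dumps(body)


# Ask about the primary copy of the large shard (shard 0).
method, path, payload = build_allocation_explain_request(
    "index-name", 0, True, include_yes_decisions=True
)
print(method, path)
print(payload)
```

The printed path and body can be pasted into Kibana Dev Tools or sent with any HTTP client.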

Here are some other notable settings:

  • The index mostly uses the default settings for cluster allocation.
  • The same index has a balanced shard distribution in the dev and staging environments. Only production shows this imbalance.
  • We delete docs from this index.
  • We use a custom _id for each doc.

Kindly let me know if any more information is needed.
Thank you!

Are you using routing, parent/child, or the join type? There must be a reason why one shard holds only about 500MB while the other is 100x that size.


Hi Alexander,

We're only using the default routing provided by Elasticsearch, and not a custom one. Below is the cluster settings config that we have.

      "persistent" : { },
      "transient" : {
        "cluster" : {
          "routing" : {
            "allocation" : {
              "enable" : "all",
              "exclude" : {
                "_ip" : ""

The IP which is excluded doesn't exist anymore.

Sorry for being unclear. That was not the routing I was referring to. Are you using the routing parameter when indexing data?

If not, what do your IDs look like?
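
To illustrate why the routing value matters: Elasticsearch picks the shard roughly as `hash(_routing) % number_of_primary_shards`, where `_routing` defaults to the document `_id`. The sketch below is only an approximation. It uses MD5 as a stand-in hash (Elasticsearch uses Murmur3 and an extra routing factor), and the IDs and the routing value `tenant-a` are invented.

```python
import hashlib
from collections import Counter

NUM_PRIMARY_SHARDS = 2


def pick_shard(routing_value, num_shards=NUM_PRIMARY_SHARDS):
    """Simplified routing: stand-in hash of the routing value, modulo shard count."""
    digest = hashlib.md5(routing_value.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_shards


doc_ids = [f"doc-{i}" for i in range(10_000)]  # hypothetical IDs

# Default behaviour: each document is routed by its own _id.
by_id = Counter(pick_shard(doc_id) for doc_id in doc_ids)

# Same routing value for every document: all of them hash to the same shard.
by_routing = Counter(pick_shard("tenant-a") for _ in doc_ids)

print("routed by _id:    ", dict(by_id))
print("routed by routing:", dict(by_routing))
```

With distinct IDs the two shards end up close to even, while a shared routing value puts every document on a single shard, which is the kind of skew seen in this index.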

We're not using any routing parameter when indexing data.

Here are some sample IDs from that index.

