Shard size is different between primary and replica

We have an index that is divided into 10 primary shards (with 1 replica per shard, so 20 shards in total).

I expected the shards to be roughly the same size, so I plotted a histogram of their sizes:

[Image: histogram of shard sizes]

There are 3 shards with very different sizes. Looking closer, I noticed all 3 of them are primary shards, and comparing them with their respective replicas I saw a large size difference:

| Shard # | Primary/Replica | Size   |
|---------|-----------------|--------|
| 5       | p               | 27.9gb |
| 5       | r               | 14.6gb |
| 6       | p               | 40.6gb |
| 6       | r               | 14.6gb |
| 7       | p               | 44.1gb |
| 7       | r               | 14.5gb |

I took a look at the document counts of every shard and saw that primary and replica counts are all the same, except for these 3 shards, where there is a small difference in documents (at most 44 documents, and our documents are small, approx. 1 kilobyte each). I'd like to know which documents are the ones that differ, but I don't know how to do that (any pointers are appreciated).
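I guess something like the sketch below might work to compare the document IDs held by the primary and the replica of one of these shards (shard 7 as an example), but I'm not sure about it: the _primary/_replica preference values are deprecated in 6.x, and this only fetches the first page of ids rather than scrolling through all ~24M documents per shard:

# %7C is the URL-encoded "|" used to combine _shards:<n> with another preference value
for copy in _primary _replica; do
  curl -s "mycluster/myindex/_search?preference=_shards:7%7C${copy}&size=10000" \
    -H 'Content-Type: application/json' \
    -d '{"_source": false, "sort": ["_doc"]}' \
    | jq -r '.hits.hits[]._id' | sort > "shard7${copy}_ids.txt"
done
diff shard7_primary_ids.txt shard7_replica_ids.txt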

What could be the cause of this large difference?

BTW: I have read other threads about this topic (here and here), so here is some more information in case it's useful:

  • This index is not currently being heavily written to (it was being heavily written to until around 16 hours ago, when we stopped)
  • We don't delete or update documents; we only insert them

This is how I calculated the sizes and document counts:

$ curl -s 'mycluster/_cat/shards?h=i,s,p,d,sto' | grep myindex

myindex                    0  p 24321381     14gb
myindex                    0  r 24321381     14gb
myindex                    1  p 24313216   13.9gb
myindex                    1  r 24313216   13.9gb
myindex                    2  p 24317009     14gb
myindex                    2  r 24317009   13.9gb
myindex                    3  p 24325100     14gb
myindex                    3  r 24325100   13.9gb
myindex                    4  p 24309484   13.9gb
myindex                    4  r 24309484   13.9gb
myindex                    5  p 24319283   27.9gb <- 
myindex                    5  r 24319280   14.6gb <- 
myindex                    6  p 24318975   40.6gb <- 
myindex                    6  r 24318931   14.6gb <- 
myindex                    7  p 24320967   44.1gb <- 
myindex                    7  r 24320929   14.5gb <- 
myindex                    8  p 24321325   13.9gb
myindex                    8  r 24321325     14gb
myindex                    9  p 24316102   13.9gb
myindex                    9  r 24316102   13.9gb
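To see whether the extra space on the oversized primaries comes from segments or from retained translog, I think a per-shard stats breakdown along these lines should work (field names as I understand the standard _stats output; shard 6 used as an example):

$ curl -s 'mycluster/myindex/_stats/store,segments,translog?level=shards' \
    | jq '.indices.myindex.shards["6"][]
          | {primary: .routing.primary,
             store_bytes: .store.size_in_bytes,
             segment_count: .segments.count,
             translog_bytes: .translog.size_in_bytes}'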

And these are my index's settings:

$ curl -s 'mycluster/myindex/_settings' | jq .
{
  "myindex": {
    "settings": {
      "index": {
        "routing": {
          "allocation": {
            "include": {
              "data_type": "hot"
            }
          }
        },
        "search": {
          "slowlog": {
            "threshold": {
              "fetch": {
                <redacted>
              },
              "query": {
                <redacted>
              }
            }
          }
        },
        "indexing": {
          "slowlog": {
            "threshold": {
              "index": {
                <redacted>
              }
            }
          }
        },
        "number_of_shards": "10",
        "provided_name": "<redacted>",
        "max_result_window": "50000",
        "creation_date": "<redacted>",
        "analysis": {
          "normalizer": {
            <redacted>
          },
          "analyzer": {
            <redacted>
          }
        },
        "number_of_replicas": "1",
        "uuid": "<redacted>",
        "version": {
          "created": "6020499"
        }
      }
    }
  }
}

Elasticsearch version:

$ curl -s 'mycluster/'
{
  "name" : "<redacted>',
  "cluster_name" : "<redacted>",
  "cluster_uuid" : "<redacted>",
  "version" : {
    "number" : "6.2.4",
    "build_hash" : "ccec39f",
    "build_date" : "2018-04-12T20:37:28.497551Z",
    "build_snapshot" : false,
    "lucene_version" : "7.2.1",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "<redacted>"
}

It's possible you are affected by https://github.com/elastic/elasticsearch/pull/30244, which was fixed in 6.3.0. Can you upgrade to at least that version?
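For reference, something like this should show which version each node is actually running once you've upgraded:

$ curl -s 'mycluster/_cat/nodes?h=name,version'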


Thanks! Updating to ES 6.4.1 indeed solved the issue.


This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.