Unbalanced shard distribution

ES version: 6.0.0
8 nodes (5 data nodes and 3 ingest nodes, each with 8 cores and 16 GB RAM)
I have an index (5 primary shards, 1 replica) with the following mapping:

{
  "myindex": {
    "mappings": {
      "order": {
        "properties": {
          "addTime": {
            "type": "date"
          },
          "channelId": {
            "type": "long"
          },
          "content": {
            "type": "text"
          },
          "contentId": {
            "type": "long"
          },
          "status": {
            "type": "long"
          }
        }
      }
    }
  }
}

The index does not use a custom routing value.

However, when I call the API "GET _cat/shards/myindex?v", the data distribution is as below:

The load and thread pools on the nodes are normal, neither busy nor high.
I ran an experiment like this:

1. Fetch about 1,000,000 doc_id values from myindex (the IDs are time-based, not auto-generated by ES).
2. Compute mod(doc_id, 5) for each ID.
3. Get the distribution of the results.
4. The result is very balanced.

Can anyone tell me why this happens? In my opinion, the documents should be balanced across shards when no custom routing is used.
Thanks for your time!

@LuckyNemo
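Elasticsearch chooses the shard from a hash of the routing value (the document _id by default), not from the raw ID. If you expect step 3 to reflect how documents are distributed across shards, you need to look at the distribution of hash(doc_id), using the exact hash function that Elasticsearch uses.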
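
To redo the experiment with hashing, here is a minimal Python sketch. It rests on assumptions that I have not verified against 6.0.0 specifically: that the routing hash is Murmur3 (x86, 32-bit, seed 0) over the UTF-16LE bytes of the _id, as I read Murmur3HashFunction in the Elasticsearch source, and that the shard is floorMod(hash, number_of_primary_shards), i.e. number_of_routing_shards equals the primary shard count. The function names shard_for_id and shard_histogram are my own, not part of any Elasticsearch API.

```python
from collections import Counter

MASK32 = 0xFFFFFFFF


def _rotl32(x, r):
    """Rotate a 32-bit value left by r bits."""
    return ((x << r) | (x >> (32 - r))) & MASK32


def murmur3_x86_32(data: bytes, seed: int = 0) -> int:
    """Plain MurmurHash3 x86 32-bit; returns an unsigned 32-bit int."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & MASK32
    n = len(data)
    body_len = n & ~3  # length rounded down to a multiple of 4

    # Body: mix in 4-byte little-endian blocks.
    for i in range(0, body_len, 4):
        k = int.from_bytes(data[i:i + 4], "little")
        k = (k * c1) & MASK32
        k = _rotl32(k, 15)
        k = (k * c2) & MASK32
        h ^= k
        h = _rotl32(h, 13)
        h = (h * 5 + 0xE6546B64) & MASK32

    # Tail: the remaining 1 to 3 bytes, if any.
    tail = data[body_len:]
    if tail:
        k = int.from_bytes(tail, "little")
        k = (k * c1) & MASK32
        k = _rotl32(k, 15)
        k = (k * c2) & MASK32
        h ^= k

    # Finalization (avalanche).
    h ^= n
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & MASK32
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & MASK32
    h ^= h >> 16
    return h


def shard_for_id(doc_id: str, num_primary_shards: int) -> int:
    """Assumed routing: floorMod(murmur3(utf16le(_id)), primary shards)."""
    unsigned = murmur3_x86_32(doc_id.encode("utf-16-le"), 0)
    # Java treats the hash as a signed int, so convert before the modulo.
    signed = unsigned - (1 << 32) if unsigned & 0x80000000 else unsigned
    # Python's % with a positive divisor behaves like Java's Math.floorMod.
    return signed % num_primary_shards


def shard_histogram(doc_ids, num_primary_shards: int) -> Counter:
    """Count how many of the given IDs land on each shard number."""
    return Counter(shard_for_id(str(d), num_primary_shards) for d in doc_ids)


if __name__ == "__main__":
    # Placeholder IDs; substitute the doc_id values fetched from myindex.
    sample_ids = range(100_000)
    for shard, count in sorted(shard_histogram(sample_ids, 5).items()):
        print(shard, count)
```

If the histogram over your real IDs is even but "GET _cat/shards/myindex?v" still shows skew, then the ID hashing is probably not the cause and something else (document sizes, deletes, or a routing value set by the client) is worth checking.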
