Should shards and replicas be the same size?


I'm using the Wikipedia river to load all the pages into an index. By default, it creates 5 shards with 1 replica each, spread across 3 nodes. Everything works well: the river ran (only once), and I can search and retrieve wiki pages.

But when I look at the distribution and size of the shards and replicas (with the head plugin), I see one primary shard with 6.5G of data while the corresponding replica (same shard number in head) has only about 0.3G. The other shards show the same kind of mismatch (two primary shards are 6.5G and the other three are < 0.4G). Also, the total size of the primary shards far outweighs the total size of the replicas. I looked at the stats 2-3 days after the initial load.

So, back to my question: shouldn't they be the same size, to ensure that the data is still available if one node goes down?
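
In case it helps to reproduce the comparison outside of head, here is a minimal sketch of how it could be scripted. It assumes the cluster exposes the `_cat/shards` API with its default columns (index, shard, prirep, state, docs, store, ip, node), and that the text has already been fetched, for example with `curl localhost:9200/_cat/shards`. The index name `wikipedia`, the node names, the IPs and the document counts in `SAMPLE` are made up for illustration; only the 6.5gb / 0.3gb sizes echo what I see. The sketch checks document counts rather than on-disk size, on the assumption that the count is the more direct indicator of whether a replica holds the same data.

```python
from collections import defaultdict

# Hypothetical sample in the default _cat/shards column layout:
# index shard prirep state docs store ip node
SAMPLE = """
wikipedia 0 p STARTED 1000 6.5gb 10.0.0.1 node1
wikipedia 0 r STARTED 1000 0.3gb 10.0.0.2 node2
wikipedia 1 p STARTED 900 0.4gb 10.0.0.2 node2
wikipedia 1 r STARTED 850 0.3gb 10.0.0.3 node3
"""


def parse_size(text):
    """Convert a _cat-style size such as '6.5gb' or '300mb' to bytes."""
    # Longer suffixes come first so that plain 'b' is only tried last.
    units = {"tb": 1024 ** 4, "gb": 1024 ** 3, "mb": 1024 ** 2, "kb": 1024, "b": 1}
    text = text.strip().lower()
    for suffix, factor in units.items():
        if text.endswith(suffix):
            return float(text[: -len(suffix)]) * factor
    return float(text)


def group_shards(cat_output):
    """Group started shard copies by (index, shard number)."""
    shards = defaultdict(list)
    for line in cat_output.strip().splitlines():
        fields = line.split()
        # Skip short rows (e.g. unassigned shards) and copies still initializing.
        if len(fields) < 6 or fields[3] != "STARTED":
            continue
        index, shard, prirep, _state, docs, store = fields[:6]
        shards[(index, int(shard))].append(
            {"role": prirep, "docs": int(docs), "bytes": parse_size(store)}
        )
    return shards


def compare_copies(shards):
    """Return (key, docs_match, size_ratio) for each shard, sorted by key."""
    rows = []
    for key, copies in sorted(shards.items()):
        docs_match = len({c["docs"] for c in copies}) == 1
        sizes = [c["bytes"] for c in copies]
        ratio = max(sizes) / min(sizes) if min(sizes) > 0 else float("inf")
        rows.append((key, docs_match, ratio))
    return rows


if __name__ == "__main__":
    for (index, shard), docs_match, ratio in compare_copies(group_shards(SAMPLE)):
        print(index, shard, "docs match:", docs_match, "size ratio: %.1f" % ratio)
```

If the document counts match for every shard while the sizes differ, the question becomes why the on-disk size differs rather than whether data is missing.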