Data is split across shards by hashing the document ID, dividing the hash by the number of shards and taking the remainder.
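The routing scheme above can be sketched in a few lines. This is an illustration of the hash-modulo idea only, not Elasticsearch's actual implementation (which hashes the routing value with murmur3); here MD5 stands in as the hash purely for the sake of a runnable example.

```python
import hashlib

def shard_for(doc_id: str, num_shards: int) -> int:
    """Pick a shard by hashing the document ID and taking the remainder."""
    # Take the first 4 bytes of the digest as an unsigned integer.
    h = int.from_bytes(hashlib.md5(doc_id.encode()).digest()[:4], "big")
    return h % num_shards

# The same ID always maps to the same shard, so lookups know where to go.
shard_for("doc-1", 5)
```

One consequence of this formula is that changing the number of shards changes the mapping for almost every document, which is why the shard count is normally fixed when an index is created.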
Shards are allocated to nodes taking a number of factors into account, including disk space. The reference manual has all the details.
New nodes and dead nodes aren't treated specially by the algorithm: allocation decisions are made as if the membership of the cluster were fixed. When an empty node joins the cluster, Elasticsearch relocates some data onto it so that each node holds roughly the same number of shards. If a node fails, the shards it held are distributed among the remaining nodes, although there's a short delay before doing anything in case the node comes back.
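The rebalancing behaviour described above can be illustrated with a toy simulation: when a node's shard count differs from another's by more than one, move a shard from the fullest node to the emptiest. This is a sketch of the balancing idea only, not Elasticsearch's actual allocator, which also weighs factors such as disk space.

```python
def rebalance(nodes: dict[str, list[int]]) -> dict[str, list[int]]:
    """Move shards until every node holds roughly the same number."""
    while True:
        fullest = max(nodes, key=lambda n: len(nodes[n]))
        emptiest = min(nodes, key=lambda n: len(nodes[n]))
        if len(nodes[fullest]) - len(nodes[emptiest]) <= 1:
            return nodes
        # Relocate one shard from the fullest node to the emptiest.
        nodes[emptiest].append(nodes[fullest].pop())

# An empty node joins a cluster of two nodes holding three shards each:
cluster = {"node-1": [0, 1, 2], "node-2": [3, 4, 5], "node-3": []}
rebalance(cluster)
# each node now holds two shards
```

The same loop covers the failure case: removing a node's entry and re-adding its shards to the pool of unassigned shards leads to the survivors absorbing them evenly.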